Thermotoga maritima delta prime polymerase subunit and use thereof

ABSTRACT

The present invention relates to an isolated DNA molecule from a thermophilic bacterium which encodes a DNA polymerase III-type enzyme subunit. Also encompassed by the present invention are host cells and expression systems including the heterologous DNA molecule of the present invention, as well as isolated replication enzyme subunits encoded by such DNA molecules. Also disclosed is a method of producing a recombinant thermostable DNA polymerase III-type enzyme, or subunit thereof, from a thermophilic bacterium, which is carried out by transforming a host cell with at least one heterologous DNA molecule of the present invention under conditions suitable for expression of the DNA polymerase III-type enzyme, or subunit thereof, and then isolating the DNA polymerase III-type enzyme, or subunit thereof.

[0001] The present application is a continuation of U.S. patent application Ser. No. 09/716,964, filed Nov. 21, 2000, which is a continuation-in-part of U.S. patent application Ser. No. 09/642,218, filed Aug. 18, 2000, as a continuation of U.S. patent application Ser. No. 09/057,416 filed Apr. 8, 1998, which claims the benefit of U.S. Provisional Patent Application Serial No. 60/043,202 filed Apr. 8, 1997, all of which are hereby incorporated by reference in their entirety.

[0002] The present invention was made with funding from National Institutes of Health Grant No. GM38839. The United States Government may have certain rights in this invention.

FIELD OF THE INVENTION

[0003] The present invention relates to thermostable DNA polymerases and, more particularly, to such polymerases as can serve as chromosomal replicases and are derived from thermophilic bacteria. More particularly, the invention extends to DNA polymerase III-type enzymes from thermophilic bacteria, including Aquifex aeolicus, Thermus thermophilus, Thermotoga maritima, and Bacillus stearothermophilus, as well as purified, recombinant or non-recombinant subunits thereof and their use, and to isolated DNA coding for such polymerases and their subunits. Such DNA is obtained from the respective genes (e.g., dnaX, holA, holB, dnaA, dnaN, dnaQ, dnaE, ssb, etc.) of various thermophilic eubacteria, including but not limited to Thermus thermophilus, Aquifex aeolicus, Thermotoga maritima, and Bacillus stearothermophilus.

BACKGROUND OF THE INVENTION

[0004] Thermostable DNA polymerases have been disclosed previously as set forth in U.S. Pat. No. 5,192,674 to Oshima et al., U.S. Pat. Nos. 5,322,785 and 5,352,778 to Comb et al., U.S. Pat. No. 5,545,552 to Mathur, and others. All of the noted references recite the use of polymerases as important catalytic tools in the practice of molecular cloning techniques such as polymerase chain reaction (PCR). Each of the references states that a drawback of the extant polymerases is their limited thermostability, and consequently limited useful life, in PCR. Such limitations also manifest themselves in the inability to obtain extended lengths of nucleotides and, in the instance of Taq polymerase, in the lack of 3′ to 5′ exonuclease activity and the resulting inability to excise misinserted nucleotides (Perrino, 1990).

[0005] More generally, such polymerases, including those disclosed in the referenced patents, are of the Polymerase I variety, as they are often 90-95 kDa in size and may have 5′ to 3′ exonuclease activity. They define a single subunit with concomitant limits on their ability to hasten the amplification process and to promote the rapid preparation of longer strands of DNA.

[0006] Chromosomal replicases are composed of several subunits in all organisms (Kornberg and Baker, 1992). In keeping with the need to replicate long chromosomes, replicases are rapid and highly processive multiprotein machines. Cellular replicases are classically comprised of three components: a clamp, a clamp loader, and the DNA polymerase (reviewed in Kelman and O'Donnell, 1995; McHenry, 1991). For purposes of the present invention, the foregoing components also serve as a broad definition of a “Pol III-type enzyme”.

[0007] DNA polymerase III holoenzyme (Pol III holoenzyme) is the multi-subunit replicase of the E. coli chromosome. Pol III holoenzyme is distinguished from Pol I type DNA polymerases by its high processivity (>50 kbp) and rapid rate of synthesis (750 nts/s) (reviewed in Kornberg and Baker, 1992; Kelman and O'Donnell, 1995). The high processivity and speed are rooted in a ring shaped subunit, called β, that encircles DNA and slides along it while tethering the Pol III holoenzyme to the template (Stukenberg et al., 1991; Kong et al., 1992). The ring shaped β clamp is assembled around DNA by the multisubunit clamp loader, called γ complex. The γ complex couples the energy of ATP hydrolysis to the assembly of the β clamp onto DNA. This γ complex, which functions as a clamp loader, is an integral component of the Pol III holoenzyme particle. A brief overview of the organization of subunits within the holoenzyme and their function follows.
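By way of illustration only, the synthesis rate quoted above can be converted into an approximate copying time for a template of a given length. The following is a minimal sketch, assuming a constant rate with no pausing or initiation delay; the names `RATE_NT_PER_S` and `synthesis_time_s` are hypothetical, and applying the E. coli figure to any other enzyme is an assumption.

```python
# Back-of-the-envelope estimate of synthesis time at a fixed rate.
# RATE_NT_PER_S is the E. coli Pol III holoenzyme figure quoted in the
# text; applying it to any other enzyme is an assumption.
RATE_NT_PER_S = 750


def synthesis_time_s(template_nt, rate_nt_per_s=RATE_NT_PER_S):
    """Seconds needed to copy template_nt nucleotides at a constant rate."""
    return template_nt / rate_nt_per_s


# Example lengths: 5 kb and 50 kb templates.
for length in (5_000, 50_000):
    print(f"{length} nt: {synthesis_time_s(length):.1f} s")
```

Under these assumptions a 50 kb stretch would be copied in roughly a minute, which gives a sense of the scale implied by the quoted rate.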

[0008] Pol III holoenzyme consists of 10 different subunits, some of which are present in multiple copies for a total of 18 polypeptide chains (Onrust et al., 1995). The organization of these subunits in the holoenzyme particle is illustrated in FIG. 1. As depicted in the diagram, the subunits of the holoenzyme can be grouped functionally into three components: 1) the DNA polymerase III core is the catalytic unit and consists of the α (DNA polymerase), ε (3′-5′ exonuclease), and θ subunits (McHenry and Crow, 1979), 2) the β “sliding clamp” is the ring shaped protein that secures the core polymerase to DNA for processivity (Kong et al., 1992), and 3) the 5 protein γ complex (γδδ′χψ) is the “clamp loader” that couples ATP hydrolysis to assembly of β clamps around DNA (O'Donnell, 1987; Maki et al., 1988). A dimer of the τ subunit acts as a “macromolecular organizer” holding together two molecules of core (Studwell-Vaughan and O'Donnell, 1991; Low et al., 1976) and one molecule of γ complex forming the Pol III* subassembly (Onrust et al., 1995). This organizing role of τ to form Pol III* is indicated in the center of FIG. 1. Two β dimers associate with the two cores within Pol III* to form the holoenzyme, which is capable of replicating both strands of duplex DNA simultaneously (Maki et al., 1988).
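The subunit count stated above (10 different subunits, 18 polypeptide chains) can be checked with a short tally. The sketch below is illustrative only: the copy numbers are assumptions chosen to be consistent with the description (two cores, a τ dimer, one γ complex taken here with two γ protomers, and two β dimers), and the names `HOLOENZYME` and `count_chains` are hypothetical.

```python
# Illustrative tally of E. coli Pol III holoenzyme subunits.
# Copy numbers are assumptions consistent with the description:
# two cores, a tau dimer, one gamma complex (taken here with two gamma
# protomers), and two beta dimers.
HOLOENZYME = {
    "alpha": 2, "epsilon": 2, "theta": 2,   # two core polymerases
    "tau": 2,                               # organizing dimer
    "gamma": 2, "delta": 1, "delta_prime": 1, "chi": 1, "psi": 1,
    "beta": 4,                              # two beta dimers
}


def count_chains(composition):
    """Return (number of distinct subunits, total polypeptide chains)."""
    return len(composition), sum(composition.values())


distinct, total = count_chains(HOLOENZYME)
print(distinct, total)
```

With these assumed copy numbers the tally reproduces the stated totals; a different γ copy number would change the chain total accordingly.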

[0009] The DNA polymerase III holoenzyme assembles onto a primed template in two distinct steps. In the first step, the γ complex assembles the β clamp onto the DNA. The γ complex and the core polymerase utilize the same surface of the β ring and they cannot both utilize it at the same time (Naktinis et al., 1996). Hence, in the second step the γ complex moves away from β, thus allowing access of the core polymerase to the β clamp for processive DNA synthesis. The γ complex and core remain attached to each other during this switching process by the τ subunit organizer.

[0010] The γ complex consists of 5 different subunits (γ₂₋₄δ₁δ′₁χ₁ψ₁). An overview of the mechanism of the clamp loading process follows. The δ subunit is the major touch-point to the β clamp and leads to ring opening, but δ is buried within γ complex such that contact with β is prevented (Naktinis et al., 1995). The γ subunit is the ATP interactive protein but is not an ATPase by itself (Tsuchihashi and Kornberg, 1989). The δ′ subunit bridges the δ and γ subunits resulting in a γδδ′ complex that exhibits DNA dependent ATPase activity and is competent to assemble clamps on DNA (Onrust et al., 1991). Upon binding of ATP to γ, a change in the conformation of the complex exposes δ for interaction with β (Naktinis et al., 1995). The function of the smaller subunits, χ and ψ, is to contact SSB (through χ) thus promoting clamp assembly and high processivity during replication (Kelman and O'Donnell, 1995).

[0011] The three component Pol III-type enzyme in eukaryotes contains a clamp that has the same shape as E. coli β, but instead of a homodimer it is a homotrimer. This homotrimeric ring, called PCNA (proliferating cell nuclear antigen), has 6 domains like β, but instead of each monomer being composed of 3 domains and dimerizing to form a 6 domain ring (as in β), the PCNA monomer has 2 domains and it trimerizes to form a 6 domain ring (Krishna et al., 1994; Kuriyan and O'Donnell, 1993). The chain fold of the domains is the same in prokaryotes (β) and eukaryotes (PCNA); thus, the rings have the same overall 6-domain-ring shape. The clamp loader of the eukaryotic Pol III-type replicase is called RFC (Replication factor C) and it consists of subunits having homology to the γ and δ′ subunits of the E. coli γ complex (Cullmann et al., 1995). The eukaryotic DNA polymerase III-type enzyme contains either of two DNA polymerases, DNA polymerase δ and DNA polymerase ε (Bambara and Jessee, 1991; Linn, 1991; Sugino, 1995). It is entirely conceivable that yet other types of DNA polymerases can function with either a PCNA or β clamp to form a Pol III-type enzyme (for example, DNA polymerase II of E. coli functions with the β subunit placed onto DNA by the γ complex clamp loader) (Hughes et al., 1991; Bonner et al., 1992). The bacteriophage T4 also utilizes a Pol III-type 3-component replicase. The clamp is a homotrimer like PCNA, called gene 45 protein (Young et al., 1992). The gene 45 protein forms the same 6-domain ring structure as β and PCNA (Moarefi et al., 2000). The clamp loader is a complex of two subunits called the gene 44/62 protein complex. The DNA polymerase is the gene 43 protein and it is stimulated by the gene 45 sliding clamp when it is assembled onto DNA by the 44/62 protein clamp loader. The Pol III-type enzyme may be either bound together into one particle (e.g., E. coli Pol III holoenzyme), or its three components may function separately (like the eukaryotic Pol III-type replicases).

[0012] There is an early report on separation of three DNA polymerases from T.th. cells; however, each polymerase form was reminiscent of the preexisting types of DNA polymerase isolated from thermophiles in that each polymerase was in the 110,000-120,000 range and lacked 3′-5′ exonuclease activity (Ruttimann et al., 1985). These are well below the molecular weight of Pol III-type complexes that contain, in addition to the DNA polymerase subunit, other subunits such as γ and τ. Although the three polymerases displayed some differences in activity (column elution behavior, and optimum divalent cation, template, and temperatures), it seems likely that these three forms were either different repair type polymerases or derivatives of one repair enzyme (e.g., Pol I) that was modified by post translational modification(s) that altered their properties (e.g., phosphorylation, methylation, proteolytic clipping of residues that alter activity, or association with different ligands such as a small protein or contaminating DNA). Despite this previous work, it remained to be demonstrated that thermophiles harbor a Pol III-type enzyme that contains multiple subunits such as γ and/or τ, functions with a sliding clamp accessory protein, or can extend a primer rapidly and processively over a long stretch (>5 kb) of ssDNA (Ruttimann et al., 1985).

[0013] Previously, it was not known what polymerase thermophilic bacteria used to replicate their chromosome since only Pol I type enzymes have been reported from thermophiles. By distinction, chromosomal replicases, such as Polymerase III, identified in E. coli, if available in a thermostable bacterium, with all its accessory subunits, could provide a great improvement over the Polymerase I type enzymes, in that they are generally much more efficient (about 5 times faster) and much more highly processive. Hence, one may expect faster and longer chain production in PCR, and higher quality of DNA sequencing ladders. Clearly, the ability to practice such synthetic techniques as PCR would be enhanced by the methods disclosed herein for obtaining genes and subunits of DNA polymerase III holoenzyme from thermophilic sources.

[0014] The present invention is directed to achieving these objectives and overcoming the various deficiencies in the art.

SUMMARY OF THE INVENTION

[0015] In accordance with the present invention, DNA Polymerase III-type enzymes as defined herein are disclosed that may be isolated and purified from a thermophilic bacterial source, that display rapid synthesis characteristic of a chromosomal replicase, and that possess all of the structural and processive advantages sought and recited above. More particularly, the invention extends to thermostable Polymerase III-type enzymes derived from thermophilic bacteria that exhibit the ability to extend a primer over a long stretch (>5 kb) of ssDNA at elevated temperature, the ability to be stimulated by a cognate sliding clamp (e.g., β) of the type that is assembled on DNA by a ‘clamp’ loader (e.g., γ complex), and have clamp loading subunits that show DNA stimulated ATPase activity at elevated temperature and/or ionic strength. Representative thermophile polymerases include those isolated from the thermophilic eubacteria Aquifex aeolicus (A.ae. polymerase) and other members of the Aquifex genus; Thermus thermophilus (T.th. polymerase), Thermus flavus (Tfl/Tub polymerase), Thermus ruber (Tru polymerase), Thermus brockianus (DYNAZYME™ polymerase), and other members of the Thermus genus; Bacillus stearothermophilus (B.st. polymerase) and other members of the Bacillus genus; Thermoplasma acidophilum (Tac polymerase) and other members of the Thermoplasma genus; and Thermotoga neapolitana (Tne polymerase; see WO 96/10640 to Chatterjee et al.), Thermotoga maritima (Tma polymerase; see U.S. Pat. No. 5,374,553 to Gelfand et al.), and other species of the Thermotoga genus (Tsp polymerase). In a preferred embodiment, the thermophilic bacteria comprise species of Aquifex, Thermus, Bacillus, and Thermotoga, and particularly A.ae., T.th., B.st., and Tma.

[0016] A particular Polymerase III-type enzyme in accordance with the invention may include at least one of the following sub-units:

[0017] A. a γ subunit having an amino acid sequence corresponding to SEQ. ID. Nos. 4 or 5 (T.th.);

[0018] B. a τ subunit having an amino acid sequence corresponding to SEQ. ID. No. 2 (T.th.), SEQ. ID. No. 120 (A.ae.), SEQ. ID. No. 142 (T.ma.) or SEQ. ID. No. 182 (B.st.);

[0019] C. an ε subunit having an amino acid sequence corresponding to SEQ. ID. No. 95 (T.th.), SEQ. ID. No. 128 (A.ae.), or SEQ. ID. No. 140 (T.ma.);

[0020] D. an α subunit including an amino acid sequence corresponding to SEQ. ID. No. 87 (T.th.), SEQ. ID. No. 118 (A.ae.), SEQ. ID. No. 138 (T.ma.), or SEQ. ID. No. 184 (PolC which has both α and ε activity, B.st.);

[0021] E. a β subunit having an amino acid sequence corresponding to SEQ. ID. No. 107 (T.th.), SEQ. ID. No. 122 (A.ae.), SEQ. ID. No. 144 (T.ma.), or SEQ. ID. No. 174 (B.st.);

[0022] F. a δ subunit having an amino acid sequence corresponding to SEQ. ID. No. 158 (T.th.), SEQ. ID. No. 124 (A.ae.), SEQ. ID. No. 146 (T.ma.) or SEQ. ID. No. 178 (B.st.);

[0023] G. a δ′ subunit having an amino acid sequence corresponding to SEQ. ID. No. 156 (T.th.), SEQ. ID. No. 126 (A.ae.), SEQ. ID. No. 148 (T.ma.) or SEQ. ID. No. 180 (B.st.);

[0024] variants, including allelic variants, muteins, analogs and fragments of any of subparts (A) through (G), and compatible combinations thereof, capable of functioning in DNA amplification and sequencing.

[0025] The invention also extends to the genes that correspond to and can code on expression for the subunits set forth above, and accordingly includes the following: dnaX, holA, holB, dnaQ, dnaE, dnaN, and ssb, as well as conserved variants and active fragments thereof.

[0026] Accordingly, the Polymerase III-type enzyme of the present invention comprises at least one gene encoding a subunit thereof, which gene is selected from the group consisting of dnaX, holA, holB, dnaQ, dnaE and dnaN, and combinations thereof. More particularly, the invention extends to the nucleic acid molecule encoding the γ and τ subunits, and includes the dnaX gene which has a nucleotide sequence as set forth herein, as well as conserved variants, active fragments and analogs thereof. Likewise, the nucleotide sequences encoding the α subunit (dnaE gene), the ε subunit (dnaQ gene), the β subunit (dnaN gene), the δ subunit (holA gene), and the δ′ subunit (holB gene) each comprise the nucleotide sequences as set forth herein, as well as conserved variants, active fragments and analogs thereof. Those nucleotide sequences for T.th. are as follows: dnaX (SEQ. ID. No. 3), dnaE (SEQ. ID. No. 86), dnaQ (SEQ. ID. No. 94), dnaN (SEQ. ID. No. 106), holA (SEQ. ID. No. 157), and holB (SEQ. ID. No. 155). Those nucleotide sequences for A.ae. are as follows: dnaX (SEQ. ID. No. 119), dnaE (SEQ. ID. No. 117), dnaQ (SEQ. ID. No. 127), dnaN (SEQ. ID. No. 121), holA (SEQ. ID. No. 123), and holB (SEQ. ID. No. 125). Those nucleotide sequences for T.ma. are as follows: dnaX (SEQ. ID. No. 141), dnaE (SEQ. ID. No. 137), dnaQ (SEQ. ID. No. 139), dnaN (SEQ. ID. No. 143), holA (SEQ. ID. No. 145), and holB (SEQ. ID. No. 147). Those nucleotide sequences for B.st. are as follows: dnaX (SEQ. ID. No. 181), polC (SEQ. ID. No. 183), dnaN (SEQ. ID. No. 173), holA (SEQ. ID. No. 177), and holB (SEQ. ID. No. 179).
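For convenience, the gene-to-sequence assignments recited above may be organized as a simple lookup table. The following is an illustrative sketch only; the identifier numbers are those recited in the preceding paragraph, while the names `NUCLEOTIDE_SEQ_ID` and `seq_id` are hypothetical and form no part of the sequence listing.

```python
# Nucleotide SEQ. ID. Nos. as recited in the text, keyed by organism
# abbreviation and gene name. The structure and names are illustrative.
NUCLEOTIDE_SEQ_ID = {
    "T.th.": {"dnaX": 3, "dnaE": 86, "dnaQ": 94, "dnaN": 106,
              "holA": 157, "holB": 155},
    "A.ae.": {"dnaX": 119, "dnaE": 117, "dnaQ": 127, "dnaN": 121,
              "holA": 123, "holB": 125},
    "T.ma.": {"dnaX": 141, "dnaE": 137, "dnaQ": 139, "dnaN": 143,
              "holA": 145, "holB": 147},
    "B.st.": {"dnaX": 181, "polC": 183, "dnaN": 173,
              "holA": 177, "holB": 179},
}


def seq_id(organism, gene):
    """Return the recited nucleotide SEQ. ID. No., or None if not listed."""
    return NUCLEOTIDE_SEQ_ID.get(organism, {}).get(gene)


# Example: the T.ma. holB gene, which encodes the delta prime subunit.
print(seq_id("T.ma.", "holB"))
# B.st. has no separate dnaE entry in this listing (polC is given instead).
print(seq_id("B.st.", "dnaE"))
```

A lookup that returns `None` simply indicates that no sequence for that gene and organism is recited in the paragraph above.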

[0027] The invention also provides methods and products for identifying, isolating and cloning DNA molecules which encode such accessory subunits encoded by the recited genes of the DNA polymerase III-type enzyme hereof.

[0028] Yet further, the invention extends to Polymerase III-type enzymes prepared by the purification of an extract taken from, e.g., the particular thermophile under examination, treated with appropriate solvents and then subjected to chromatographic separation on, e.g., an anion exchange column, followed by analysis of long chain synthetic ability or Western analysis of the respective peaks against antibody to at least one of the anticipated enzyme subunits to confirm presence of Pol III, and thereafter, peptide sequencing of subunits that co-purify and amplification to obtain the putative gene and its encoded enzyme.

[0029] The present invention also relates to recombinant γ, τ, ε, α (as well as PolC), δ, δ′ and β subunits and SSB from thermophiles. In the instance of the γ and τ subunits of T.th., the invention includes the characterization of a frameshifting sequence that is internal to the gene and specifies relative abundance of the γ and τ gene products of T.th. dnaX. From this characterization, expression of either one of the subunits can be increased at the expense of the other (i.e., a mutant frameshift site could yield all τ, whereas simple recloning at the end of the frameshift could yield exclusively γ and no τ).
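The effect of a translational frameshift at a slippery run of A residues can be illustrated with a toy example. The sketch below is not the T.th. dnaX sequence: `TOY_MRNA`, `SLIP_END`, and `translate` are hypothetical names, the sequence is invented solely so that a shifted frame meets an early stop codon, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Invented toy sequence: start codon, one codon, a run of nine A residues,
# then downstream sequence. It is NOT the real dnaX sequence.
TOY_MRNA = "ATGGCC" + "AAAAAAAAA" + "GCTGACCTAA"
SLIP_END = 15  # index of the first base after the A9 run


def translate(seq, shift_at=None, shift=0):
    """Translate from position 0; optionally slip by `shift` bases once,
    when the reading position reaches `shift_at`. Stops at a stop codon."""
    protein, pos, shifted = [], 0, False
    while pos + 3 <= len(seq):
        if shift_at is not None and not shifted and pos == shift_at:
            pos += shift          # e.g. -1 moves back one base
            shifted = True
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
        pos += 3
    return "".join(protein)


print(translate(TOY_MRNA))                # no frameshift
print(translate(TOY_MRNA, SLIP_END, -1))  # -1 frameshift after the A run
print(translate(TOY_MRNA, SLIP_END, -2))  # -2 frameshift after the A run
```

In this toy case the unshifted frame reads through the downstream sequence, while the −1 frame encounters a stop codon shortly after the slippery run and yields a shorter product, analogous in principle to the relationship between τ and γ described above.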

[0030] In a further aspect of the present invention, DNA probes can be constructed from the DNA sequences coding for, e.g., the T.th., A.ae., T.ma., or B.st. dnaX, dnaQ, dnaE, dnaA, dnaN, holA, holB, and ssb genes, conserved variants and active fragments thereof, all as defined herein, and may be used to identify and isolate the corresponding genes coding for the subunits of DNA polymerase III holoenzyme from other thermophiles, such as those listed earlier herein. Accordingly, all chromosomal replicases (DNA Polymerase III-type) from thermophilic sources are contemplated and included herein.

[0031] The invention also extends to methods for identifying Polymerase III-type enzymes by use of the techniques of long-chain extension and elucidation of subunits with antibodies, as described herein and with reference to the examples.

[0032] The invention further extends to the isolated and purified DNA Polymerase III from T.th., A.ae., T.ma., and B.st., the amino acid sequences of the γ, τ, ε, α (as well as PolC), δ, δ′, and β subunits and SSB, as set forth herein, and the nucleotide sequences of the corresponding genes from T.th., A.ae., T.ma., or B.st. set forth herein, as well as to active fragments thereof, oligonucleotides and probes prepared or derived therefrom and the transformed cells that may be likewise prepared. Accordingly, the invention comprises the individual subunits enumerated above and hereinafter, corresponding isolated polynucleotides and respective amino acid sequences for each of the γ, τ, ε, α (as well as PolC), δ, δ′, and β subunits and SSB, and to conserved variants, fragments, and the like, as well as to methods of their preparation and use in DNA amplification and sequencing. In a particular embodiment, the invention extends to vectors for the expression of the subunit genes of the present invention.

[0033] The invention also includes methods for the preparation of the DNA Polymerase III-type enzymes and the corresponding subunit genes of the present invention, and to the use of the enzymes and constructs having active fragments thereof, in the preparation, reconstitution or modification of like enzymes, as well as in amplification and sequencing of DNA by methods such as PCR, and like protocols, and to the DNA molecules amplified and sequenced by such methods. In this regard, a Pol III-type enzyme that is reconstituted in the absence of ε, or using a mutated ε with less 3′-5′ exonuclease activity, may be a superior enzyme in either PCR or DNA sequencing applications (e.g., Tabor et al., 1995).

[0034] The invention is directed to methods for amplifying and sequencing a DNA molecule, particularly via the polymerase chain reaction (PCR), using the present DNA polymerase III-type enzymes or complexes. In particular, the invention extends to methods of amplifying and sequencing of DNA using thermostable pol III-type enzyme complexes isolated from thermophilic bacteria such as Thermotoga and Thermus species, or recombinant thermostable enzymes. The invention also provides amplified DNA molecules made by the methods of the invention, and kits for amplifying or sequencing a DNA molecule by the methods of the invention.

[0035] In this connection, the invention extends to methods for amplification of DNA that can achieve long chain extension of primed DNA, as by the application and use of Polymerase III-type enzymes of the present invention. An illustration of such methods is presented in Examples 15 and 16, infra.

[0036] Likewise, kits for amplification and sequencing of such DNA molecules are included, which kits contain the enzymes of the present invention, including subunits thereof, together with other necessary or desirable reagents and materials, and directions for use. The details of the practice of the invention as set forth above and later on herein, and with reference to the patents and literature cited herein, are all expressly incorporated herein by reference and made a part hereof.

[0037] As stated, and in accordance with a principal object of the present invention, Polymerase III-type enzymes and their sub-units are provided that are derived from thermophiles and that are adapted to participate in improved DNA amplification and sequencing techniques, and the consequent ability to prepare larger DNA strands more rapidly and accurately.

[0038] It is a further object of the present invention to provide DNA molecules that are amplified and sequenced using the Polymerase III-type enzymes hereof.

[0039] It is a still further object of the present invention to provide enzymes and corresponding methods for amplification and sequencing of DNA that can be practiced without the participation of the clamp-loading component of the enzyme.

[0040] It is a still further object of the present invention to provide kits and other assemblies of materials for the practice of the methods of amplification and sequencing as aforesaid, that include and use the DNA polymerase III-type enzymes herein as part thereof.

[0041] One goal of this invention is to fully reconstitute the rapid and processive replicase from an extreme thermophilic eubacterium from fully recombinant protein subunits. One might think that the extreme heat in which these bacteria grow may have resulted in a completely different solution to the problem of chromosome replication. Prior to filing of the previously-identified priority applications, it is believed that Pol III had not been identified in any thermophile until the present inventors found that Thermus thermophilus, which grows at a rather high temperature of 70-80° C., would appear to contain a Pol III. Subsequent to this invention, the genome sequence of A. aeolicus was published which shows dnaE, dnaN, and dnaX genes. However, previous work did not fully reconstitute the working replication machinery from fully recombinant subunits. The holA and holB genes have not been identified previously in T. thermophilus or A. aeolicus, and studies in the E. coli system show that delta and delta prime, encoded by holA and holB, respectively, are essential to loading the beta clamp onto DNA and, thus, are essential for rapid and processive holoenzyme function (U.S. Pat. Nos. 5,583,026 and 5,668,004 to O'Donnell, which are hereby incorporated by reference).

[0042] This invention fully reconstitutes a functional DNA polymerase III holoenzyme from the extreme thermophiles Thermus thermophilus and Aquifex aeolicus. Aquifex aeolicus grows at an even higher temperature than Thermus thermophilus, up to 85° C. In this invention, the genes of Thermus thermophilus, Aquifex aeolicus, Thermotoga maritima, and Bacillus stearothermophilus that are necessary to reconstitute the complete DNA polymerase III machinery, which acts as a rapid and processive polymerase, are identified. Indeed, both the delta prime (holB) and delta (holA) subunits are needed.

[0043] The dnaE, dnaN, dnaX, dnaQ, holA, and holB genes are used to express and purify the protein “gears”, and the proteins are used to reassemble the replication machine. The T.th. Pol III is similar to that of E. coli. The A.ae. Pol III is slightly dissimilar from the machinery of previously studied replicases. The A.ae. dnaX gene encodes only one protein, tau, and in this fashion is similar to the dnaX of the gram positive organism, Staphylococcus aureus. In contrast, the dnaX of the gram negative cell, E. coli, produces two proteins. The Aquifex aeolicus polymerase subunit, alpha (encoded by dnaE), does not contain the 3′-5′ proofreading exonuclease. In this regard, A. aeolicus is similar to E. coli, but dissimilar to the replicase of the gram positive organisms. In gram positive organisms, the PolC polymerase subunit of the replicase contains the exonuclease activity in the same polypeptide chain as the polymerase (Low et al., 1976; Barnes et al., 1994; Pacitti et al., 1995). Further, the polymerase III of thermophilic bacteria retains activity at high temperature.

[0044] Thermostable rapid and processive three component DNA polymerases can be applied to several important uses. The DNA polymerases currently in use for DNA sequencing and DNA amplification are much slower and thus could be improved upon. This is especially true of amplification, as the three component polymerase is capable of speed and high processivity, making possible amplification of very long (tens of kb to Mb) lengths of DNA in a time-efficient manner. These three-component polymerases also function in conjunction with a replicative helicase (DnaB), and thus are capable of amplification at a single temperature, using the helicase to melt the DNA duplex. This property could be useful in some methods of amplification, and in polymerase chain reaction (PCR) methodology. For example, the ατδδ′/β form of the E. coli DNA polymerase III holoenzyme has been shown to function in both DNA sequencing and PCR (U.S. Pat. Nos. 5,583,026 and 5,668,004 to O'Donnell).

[0045] Other objects and advantages will become apparent from a review of the ensuing description which proceeds with reference to the following illustrative drawings.

DESCRIPTION OF THE DRAWINGS

[0046] FIG. 1 is a schematic depiction of the structure and components of enzymes of the general family to which the enzymes of the present invention belong.

[0047] FIG. 2 is an alignment of the N-terminal regions of E. coli (SEQ. ID. No. 19) and B. subtilis (SEQ. ID. No. 20) dnaX gene product. Asterisks indicate identities. The ATP binding consensus sequence is indicated. The two regions used for PCR primer design are shown in bold.

[0048] FIG. 3 is an image showing the Southern analysis of T. thermophilus genomic DNA. Genomic DNA was analyzed for presence of the dnaZ gene using the PCR radiolabeled probe. Enzymes used for digestion are shown above each lane. The numbering to the right corresponds to the length of DNA fragments (kb).

[0049] FIGS. 4A and 4B depict the full sequence of the dnaX gene of T. thermophilus. DNA sequence (upper case, and corresponding to SEQ ID No. 1) and predicted amino acid sequence (lower case, and corresponding to SEQ ID No. 2) yields a 529 amino acid protein (τ) of 58.0 kDa. A putative frameshifting sequence containing several A residues 1478-1486 (underlined) may produce a smaller protein (γ) of 49.8 kDa. The potential Shine-Dalgarno (S.D.) signal is bold and underlined. The start codon is in bold, and the stop codon for τ is marked by an asterisk. The potential stop codon for γ is shown in bold after the frameshift site, and two potential Shine-Dalgarno sequences upstream of the frameshift site are indicated. Sequences of the primers used for PCR are shown in italics above the nucleotide sequence of dnaX. The ATP binding site is indicated, and the asterisks above the four Cys residues near the ATP site indicate the putative Zn²⁺ finger. The proline rich area is indicated above the sequence. Numbering of the nucleotide sequence is presented to the right. Numbering of the amino acid sequence of τ is shown in parentheses to the right.

[0050] FIG. 4C depicts the isolated DNA coding sequence for the dnaX gene (also present in FIGS. 4A and 4B) in accordance with the invention, which corresponds to SEQ. ID. No. 3.

[0051] FIG. 4D depicts the polypeptide sequence of the γ subunit of the Polymerase III of the present invention, which corresponds to SEQ. ID. No. 4.

[0052] FIG. 4E depicts the polypeptide sequence of the γ subunit of the Polymerase III of the present invention defined by a −1 frameshift, which corresponds to SEQ. ID. No. 4.

[0053] FIG. 4F depicts the polypeptide sequence of the γ subunit of the Polymerase III of the present invention defined by a −2 frameshift, which corresponds to SEQ. ID. No. 5.

[0054] FIGS. 5A-B are alignments of the γ/τ ATP binding domains for different bacteria. Dots indicate those residues that are identical to the E. coli dnaX sequence. The ATP consensus site is underlined, and the conserved cysteine residues that form the zinc finger are indicated with asterisks. E. coli, Escherichia coli (SEQ. ID. No. 21); H. inf., Haemophilus influenzae (SEQ. ID. No. 22); B. sub., Bacillus subtilis (SEQ. ID. No. 23); C. cres., Caulobacter crescentus (SEQ. ID. No. 24); M. gen., Mycoplasma genitalium (SEQ. ID. No. 25); T.th., Thermus thermophilus (SEQ. ID. No. 26). Alignments were produced using Clustal.

[0055] FIG. 6 is a diagram indicating a signal for ribosomal frameshifting in T.th. dnaX. The diagram shows part of the sequence of the RNA (SEQ. ID. No. 27) around the frameshifting site (SEQ. ID. No. 28), including the suspected slippery sequence A9 (bold italic). The stop codon in the −2 reading frame is indicated. Also indicated are potential stem loop structures and the nearest stop codons in the −1 reading frame.

[0056] FIG. 7 is an image showing a Western analysis of γ and τ in T.th. cells. Whole cells were lysed in SDS and electrophoresed on a 10% SDS polyacrylamide gel, then transferred to a membrane and probed with polyclonal antibody against E. coli γ/τ as described in Experimental Procedures. Positions of molecular weight size markers are shown to the left. Putative T.th. γ and τ are indicated to the right.

[0057] FIGS. 8A-B are images of E. coli colonies expressing T.th. dnaX −1 and −2 frameshifts. The region of the dnaX gene slippery sequence was cloned into the lacZ gene of pUC19 in three reading frames, then transformed into E. coli cells and plated on LB plates containing X-gal. The slippery sequence was also mutated by inserting two G residues into the A9 sequence and then cloned into pUC19 in all three reading frames. Color of colonies observed is indicated by the plus signs. The picture shows the colonies; the type of frameshift required for readthrough (blue color) is indicated next to the sector.

[0058] FIG. 9 shows the construction of the T.th. γ/τ expression vector. A genomic fragment containing a partial sequence of dnaX was cloned into pALTER-1. This fragment was subcloned into pUC19 (pUC19_dnaX). Then the N-terminal section of dnaX was amplified such that the fragment was flanked by NdeI (at the initiating codon) and the internal BamHI site. This fragment was inserted to form the entire coding sequence of the dnaX gene in pUC19 (pUC19dnaX). The dnaX gene was then cloned behind the polyhistidine leader in the T7 based expression vector pET16 to give pET16dnaX. Details are in “Experimental Procedures”.

[0059] FIGS. 10A-C illustrate the purification of recombinant T.th. γ and τ subunits. T.th. γ and τ subunits were expressed in E. coli harboring pET16dnaX. Molecular size markers are shown to the left of the gels, and the two induced proteins are labeled as g and t to the right of the gel. Panel A) 10% SDS gel of E. coli whole cell lysates before and after induction with IPTG. Panel B) 8% SDS gel of the two purification steps after cell lysis. First lane: the lysate was applied to a HiTrap Nickel chromatography column. Second lane: the T.th. γ/τ subunits were further purified on a Superose 12 gel filtration column. Third lane: the E. coli γ and τ subunits. Panel C) Western analysis of the pure T.th. γ and τ subunits (first lane) and E. coli γ and τ subunits (second lane).

[0060] FIGS. 11A-B show the gel filtration of T.th. γ and τ. T.th. γ and τ were gel filtered on a Superose 12 column. Column fractions were analyzed for ATPase activity and in a Coomassie Blue stained 10% SDS polyacrylamide gel. Positions of molecular weight markers are shown to the left of the gel. The elution positions of size standards analyzed in a parallel Superose 12 column under identical conditions are indicated above the gel. The standards were thyroglobulin (670 kDa), bovine gamma globulin (150 kDa), chicken ovalbumin (44 kDa), and equine myoglobin (17 kDa).

[0061] FIGS. 12A-C illustrate the characterization of the T.th. γ and τ ATPase activity. The T.th. γ/τ and E. coli τ subunits are compared in their ATPase activity characteristics. Due to the greater activity of E. coli τ, the values are plotted as percent for ease of comparison. Actual specific activities for 100% values are given below as pmol ATP hydrolyzed/30 min./pmol T.th. γ/τ (or pmol E. coli τ). Panel A) T.th. γ and τ ATPase is stimulated by the presence of ssDNA. T.th. γ/τ was incubated at 65° C. Specific activity values were: 11.5 (+DNA); 2.5 (−DNA). E. coli τ was assayed at 37° C. Specific activity values were: 112.5 (+DNA); 7.3 (−DNA). Panel B) Temperature stability of DNA stimulated ATPase activity. T.th. γ/τ, 11.3 (65° C.); E. coli τ, 97.5 (37° C.). Panel C) Stability of T.th. γ/τ ATPase to NaCl. T.th. γ/τ, 8.1 (100 mM added NaCl and 65° C.); E. coli τ, 52.7 (0 M added NaCl and 37° C.).
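The percent normalization used for FIG. 12A can be reproduced from the specific activities quoted above. The following is a minimal sketch: the dictionary keys and function names are hypothetical, and treating the +DNA value as the 100% reference for each enzyme is an assumption made for illustration.

```python
# Specific activities quoted for Panel A, in pmol ATP hydrolyzed / 30 min / pmol enzyme.
SPECIFIC_ACTIVITY = {
    "T.th. gamma/tau (65 C)": {"+DNA": 11.5, "-DNA": 2.5},
    "E. coli tau (37 C)": {"+DNA": 112.5, "-DNA": 7.3},
}

def percent_of_reference(value, reference):
    """Express `value` as a percentage of the 100% reference value."""
    return 100.0 * value / reference

def fold_stimulation(with_dna, without_dna):
    """Ratio of the ssDNA-stimulated activity to the unstimulated activity."""
    return with_dna / without_dna

for enzyme, activity in SPECIFIC_ACTIVITY.items():
    pct = percent_of_reference(activity["-DNA"], activity["+DNA"])
    fold = fold_stimulation(activity["+DNA"], activity["-DNA"])
    print(f"{enzyme}: -DNA is {pct:.1f}% of +DNA; stimulation {fold:.1f}-fold")
```

Plotting each enzyme against its own reference puts the two curves on a common 0-100% scale even though the absolute activities differ by roughly an order of magnitude.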

[0062] FIGS. 13A-13C are graphs that summarize the purification of the DNA polymerase III from T.th. extracts. Panel A) shows the activity and total protein in column fractions from the Heparin Agarose column. Peak 1 fractions were chromatographed on ATP agarose. Panel B) depicts the ATP-agarose column step, and Panel C) shows the total protein and DNA polymerase activity eluted from the MonoQ column.

[0063] FIGS. 14A-B are SDS polyacrylamide gels of T.th. subunits. FIG. 14A is a 12% SDS polyacrylamide gel stained with Coomassie Blue of the MonoQ column. Load stands for the material loaded onto the column (ATP agarose bound fractions). FT stands for protein that flowed through the MonoQ column. Fractions are indicated above the gel. T.th. subunits in fractions 17-19 are indicated by the labels placed between fractions 18 and 19. Additional small subunits may be present but difficult to visualize, or may have run off the gel. E. coli γ,δ shows a mixture of the α, γ, and δ subunits of DNA polymerase III holoenzyme (they are labeled to the right in the figure). FIG. 14B shows the Western results of an SDS gel of the MonoQ fractions probed with rabbit antiserum raised against the E. coli α subunit. Load and FT are as described in Panel A. Fraction numbers are shown above the gel. The band that comigrates with E. coli α, and the band in the Coomassie Blue stained gel in Panel A, is marked with an arrow. This band was analyzed for microsequence and the results are shown in FIG. 15.

[0064] FIGS. 15A-B show the alignments of the peptides obtained from T.th. α subunit, TTH1 (shown in A) and TTH2 (shown in B), with the amino acid sequences of the α subunits of other organisms. The amino acid numbers of these regions within each respective protein sequence are shown to the right. The abbreviations of the organisms are as follows. E.coli, Escherichia coli (SEQ ID NOS: 72 and 79 in 15A-B, respectively); V.chol., Vibrio cholerae (SEQ ID NOS: 73 and 80 in 15A-B, respectively); H. inf, Haemophilus influenzae (SEQ ID NOS: 74 and 81 in 15A-B, respectively); R. prow., Rickettsia prowazekii (SEQ ID NOS: 75 and 82 in 15A-B, respectively); H. pyl., Helicobacter pylori (SEQ ID NOS: 76 and 83 in 15A-B, respectively); S.sp., Synechocystis sp. (SEQ ID NOS: 77 and 84 in 15A-B, respectively); M.tub., Mycobacterium tuberculosis (SEQ ID NOS: 78 and 85 in 15A-B, respectively); T.th., Thermus thermophilus (SEQ ID NOS: 61 and 60 in 15A-B, respectively).

[0065] FIGS. 16A-C show a nucleotide (Panels A-B, SEQ. ID. No. 86) and amino acid (Panel C, SEQ. ID. No. 87) sequence of the dnaE gene encoding the α subunit of DNA polymerase III replication enzyme.

[0066] FIG. 17 shows an alignment of the amino acid sequence of ε subunits encoded by dnaQ of several organisms. The amino acid sequence of the Thermus thermophilus ε subunit of dnaQ is also shown. T.th., Thermus thermophilus (SEQ. ID. No. 88); D.rad., Deinococcus radiodurans (SEQ. ID. No. 89); Bac.sub., Bacillus subtilis (SEQ. ID. No. 90); H.inf., Haemophilus influenzae (SEQ. ID. No. 91); E.c., Escherichia coli (SEQ. ID. No. 92); H.pyl., Helicobacter pylori (SEQ. ID. No. 93). The regions used to obtain the inner part of the dnaQ gene are shown in bold. The starts used for expression of the T.th. ε subunit are marked.

[0067] FIGS. 18A-B show the nucleotide (Panel A, SEQ. ID. No. 94) and amino acid (Panel B, SEQ. ID. No. 95) sequence of the dnaQ gene encoding the ε subunit of DNA polymerase III replication enzyme.

[0068] FIGS. 19A-B show an alignment of the DnaA protein of several organisms. The amino acid sequence of the Thermus thermophilus DnaA protein is also shown. P.mar., Pseudomonas marcesans (SEQ. ID. No. 96); Syn.sp., Synechocystis sp. (SEQ. ID. No. 97); Bac.sub., Bacillus subtilis (SEQ. ID. No. 98); M. tub., Mycobacterium tuberculosis (SEQ. ID. No. 99); T.th., Thermus thermophilus (SEQ. ID. No. 100); E.coli., Escherichia coli (SEQ. ID. No. 101); T. mar., Thermotoga maritima (SEQ. ID. No. 102); and H.pyl., Helicobacter pylori (SEQ. ID. No. 103).

[0069] FIGS. 20A-B show the nucleotide (Panel A, SEQ. ID. No. 104) and amino acid (Panel B, SEQ. ID. No. 105) sequence of the dnaA gene of Thermus thermophilus.

[0070] FIGS. 21A-B show the nucleotide (Panel A, SEQ. ID. No. 106) and amino acid (Panel B, SEQ. ID. No. 107) sequence of the dnaN gene encoding the β subunit of DNA polymerase III replication enzyme.

[0071] FIGS. 22A-B show an alignment of the β subunit of T.th. to the β subunits of other organisms. T.th., Thermus thermophilus (SEQ. ID. No. 108); E. coli, Escherichia coli (SEQ. ID. No. 109); P. mirab, Proteus mirabilis (SEQ. ID. No. 110); H. infl, Haemophilus influenzae (SEQ. ID. No. 111); P. put., Pseudomonas putida (SEQ. ID. No. 112); and B. cap., Buchnera aphidicola (SEQ. ID. No. 113).

[0072] FIG. 23 is a map of the pET24:dnaN plasmid. The functional regions of the plasmid are indicated by arrows and italics; restriction sites are marked with bars and symbols. The hatched parts in the plasmid correspond to T.th. dnaN.

[0073] FIGS. 24A-B show the induction of T.th. β in E. coli cells harboring the T.th. β expression vector. Panel A is the cell induction. The first lane shows molecular weight markers (MW). The second lane shows uninduced E. coli cells, and the third lane shows induced E. coli. The induced T.th. β is indicated by the arrow shown to the left. Induced cells were lysed then treated with heat and the soluble portion was chromatographed on MonoQ. Panel B shows the results of MonoQ purification of T.th. β.

[0074] FIG. 25A is a schematic depiction of the use of the enzymes of the present invention in accordance with an alternate embodiment hereof. In this scheme the clamp (β or PCNA) slides over the end of linear DNA to enhance the polymerase (Pol III-type such as Pol III, PolB or PolB). In this fashion the clamp loader activity is not needed.

[0075] FIG. 25B graphically demonstrates the results of the practice of the alternate embodiment of the invention described and set forth in Example 15, infra. Lane 1, E. coli Pol III without β; Lane 2, E. coli with β; Lane 3, human Polδ without PCNA; Lane 4, human Polδ with PCNA; Lane 5, T.th. Pol III without T.th. β; Lane 6, T.th. Pol III with T.th. β. The respective pmol synthesis in lanes 1-6 are: 6, 35, 2, 24, 0.6 and 1.9.

[0076] FIGS. 26A-B show the use of T.th. Pol III in extending singly primed M13mp18 to an RFII form. The scheme in FIG. 26A shows the primed template in which a DNA 57-mer was annealed to the M13mp18 ssDNA circle. Then T.th. β subunit (produced recombinantly) and T.th. Pol III were added to the DNA in the presence of radioactive nucleoside triphosphates. In FIG. 26B, the products of the reaction were analyzed in a 0.8% native agarose gel. The positions of the ssDNA starting material, the RFII product, and intermediate species are shown to the sides of the gel. Lane 1, use of Pol III. Lane 2, use of the non-Pol III DNA polymerase.

[0077] FIG. 27 is an SDS polyacrylamide gel of the proteins of the A. aeolicus replication machinery.

[0078] FIG. 28 is an SDS polyacrylamide gel analysis of the MonoQ fractions of the method used to reconstitute and purify the A. aeolicus τδδ′ complex.

[0079] FIG. 29 is an SDS polyacrylamide gel analysis of the gel filtration column fractions used in the preparation of the A. aeolicus ατδδ′ complex. The bottom gel analysis shows the profile obtained using the A. aeolicus α subunit (polymerase) in the absence of the other subunits.

[0080] FIG. 30 is an alkaline agarose gel analysis of reaction products for extension of a single primer around a 7.2 kb M13mp18 circular ssDNA genome that has been coated with A. aeolicus SSB. The time course on the left is produced by ατδδ′/β, and the time course on the right is produced by ατδδ′ in the absence of β.

[0081] FIG. 31 is a graph illustrating the optimal temperature for activity of the alpha subunit of Thermus replicase using a calf thymus DNA replication assay. Reactions were shifted to the indicated temperature for 5 minutes before detecting the level of DNA synthesis activity.

[0082] FIG. 32 is a graph illustrating the optimal temperature for activity of the alpha subunit of the Aquifex replicase using a calf thymus DNA replication assay. Reactions were shifted to the indicated temperature for 5 minutes before detecting the level of DNA synthesis activity.

[0083] FIGS. 33A-E illustrate the heat stability of Aquifex components. Assays of α (FIG. 33A), β (FIG. 33B), τδδ′ complex (FIG. 33C), SSB (FIG. 33D) and ατδδ′ complex (FIG. 33E) were performed after heating samples at the indicated temperatures. Components were heated in buffer containing the following: 0.1% Triton X-100 (filled diamonds); 0.05% Tween-20 and 0.01% NP-40 (filled circles); 4 mM CaCl₂ (filled triangles); 40% Glycerol (inverted filled triangles); 0.01% Triton X-100, 0.05% Tween-20, 0.01% NP-40, 4 mM CaCl₂ (half-filled square); 40% Glycerol, 0.1% Triton X-100 (open diamonds); 40% Glycerol, 0.05% Tween-20, 0.01% NP-40 (open circles); 40% Glycerol, 4 mM CaCl₂ (open triangles); 40% Glycerol, 0.01% Triton X-100, 0.05% Tween-20, 0.01% NP-40, 4 mM CaCl₂ (half-filled diamonds).

[0084] FIGS. 34A-B show the nucleotide sequence (SEQ. ID. No. 117) of the dnaE gene of A. aeolicus.

[0085] FIG. 35 shows the amino acid sequence (SEQ. ID. No. 118) of the α subunit of A. aeolicus.

[0086] FIG. 36 shows the nucleotide sequence (SEQ. ID. No. 119) of the dnaX gene of A. aeolicus.

[0087] FIG. 37 shows the amino acid sequence (SEQ. ID. No. 120) of the tau subunit of A. aeolicus.

[0088] FIG. 38 shows the nucleotide sequence (SEQ. ID. No. 121) of the dnaN gene of A. aeolicus.

[0089] FIG. 39 shows the amino acid sequence (SEQ. ID. No. 122) of the β subunit of A. aeolicus.

[0090] FIG. 40 shows the partial nucleotide sequence (SEQ. ID. No. 123) of the holA gene of A. aeolicus.

[0091] FIG. 41 shows the partial amino acid sequence (SEQ. ID. No. 124) of the δ subunit of A. aeolicus.

[0092] FIG. 42 shows the nucleotide sequence (SEQ. ID. No. 125) of the holB gene of A. aeolicus.

[0093] FIG. 43 shows the amino acid sequence (SEQ. ID. No. 126) of the δ′ subunit of A. aeolicus.

[0094] FIG. 44 shows the nucleotide sequence (SEQ. ID. No. 127) of the dnaQ gene of A. aeolicus.

[0095] FIG. 45 shows the amino acid sequence (SEQ. ID. No. 128) of the ε subunit of A. aeolicus.

[0096] FIG. 46 shows the nucleotide sequence (SEQ. ID. No. 129) of the ssb gene of A. aeolicus.

[0097] FIG. 47 shows the amino acid sequence (SEQ. ID. No. 130) of the single-strand binding protein of A. aeolicus.

[0098] FIG. 48 shows the nucleotide sequence (SEQ. ID. No. 131) of the dnaB gene of A. aeolicus.

[0099] FIG. 49 shows the amino acid sequence (SEQ. ID. No. 132) of the DnaB helicase of A. aeolicus.

[0100] FIG. 50 shows the nucleotide sequence (SEQ. ID. No. 133) of the dnaG gene of A. aeolicus.

[0101] FIG. 51 shows the amino acid sequence (SEQ. ID. No. 134) of the DnaG primase of A. aeolicus.

[0102] FIG. 52 shows the nucleotide sequence (SEQ. ID. No. 135) of the dnaC gene of A. aeolicus.

[0103] FIG. 53 shows the amino acid sequence (SEQ. ID. No. 136) of the DnaC protein of A. aeolicus.

[0104] FIGS. 54A-B show the nucleotide sequence (SEQ. ID. No. 137) of the dnaE gene of T. maritima.

[0105] FIG. 55 shows the amino acid sequence (SEQ. ID. No. 138) of the α subunit of T. maritima.

[0106] FIG. 56 shows the nucleotide sequence (SEQ. ID. No. 139) of the dnaQ gene of T. maritima.

[0107] FIG. 57 shows the amino acid sequence (SEQ. ID. No. 140) of the ε subunit of T. maritima.

[0108] FIG. 58 shows the nucleotide sequence (SEQ. ID. No. 141) of the dnaX gene of T. maritima.

[0109] FIG. 59 shows the amino acid sequence (SEQ. ID. No. 142) of the tau subunit of T. maritima.

[0110] FIG. 60 shows the nucleotide sequence (SEQ. ID. No. 143) of the dnaN gene of T. maritima.

[0111] FIG. 61 shows the amino acid sequence (SEQ. ID. No. 144) of the β subunit of T. maritima.

[0112] FIG. 62 shows the nucleotide sequence (SEQ. ID. No. 145) of the holA gene of T. maritima.

[0113] FIG. 63 shows the amino acid sequence (SEQ. ID. No. 146) of the δ subunit of T. maritima.

[0114] FIG. 64 shows the nucleotide sequence (SEQ. ID. No. 147) of the holB gene of T. maritima.

[0115] FIG. 65 shows the amino acid sequence (SEQ. ID. No. 148) of the δ′ subunit of T. maritima.

[0116] FIG. 66 shows the nucleotide sequence (SEQ. ID. No. 149) of the ssb gene of T. maritima.

[0117] FIG. 67 shows the amino acid sequence (SEQ. ID. No. 150) of the single-strand binding protein of T. maritima.

[0118] FIG. 68 shows the nucleotide sequence (SEQ. ID. No. 151) of the dnaB gene of T. maritima.

[0119] FIG. 69 shows the amino acid sequence (SEQ. ID. No. 152) of the DnaB helicase of T. maritima.

[0120] FIG. 70 shows the nucleotide sequence (SEQ. ID. No. 153) of the dnaG gene of T. maritima.

[0121] FIG. 71 shows the amino acid sequence (SEQ. ID. No. 154) of the DnaG primase of T. maritima.

[0122] FIG. 72 shows the nucleotide sequence (SEQ. ID. No. 155) of the holB gene of T. thermophilus.

[0123] FIG. 73 shows the amino acid sequence (SEQ. ID. No. 156) of the δ′ subunit of T. thermophilus.

[0124] FIG. 74 shows the nucleotide sequence (SEQ. ID. No. 157) of the holA gene of T. thermophilus.

[0125] FIG. 75 shows the amino acid sequence (SEQ. ID. No. 158) of the δ subunit of T. thermophilus.

[0126] FIG. 76 shows the nucleotide sequence (SEQ. ID. No. 171) of the ssb gene of T. thermophilus.

[0127] FIG. 77 shows the amino acid sequence (SEQ. ID. No. 172) of the single-strand binding protein of T. thermophilus.

[0128] FIG. 78 shows the partial nucleotide sequence (SEQ. ID. No. 173) of the dnaN gene of B. stearothermophilus.

[0129] FIG. 79 shows the partial amino acid sequence (SEQ. ID. No. 174) of the β subunit of B. stearothermophilus.

[0130] FIG. 80 shows the nucleotide sequence (SEQ. ID. No. 175) of the ssb gene of B. stearothermophilus.

[0131] FIG. 81 shows the amino acid sequence (SEQ. ID. No. 176) of the single-strand binding protein of B. stearothermophilus.

[0132] FIG. 82 shows the nucleotide sequence (SEQ. ID. No. 177) of the holA gene of B. stearothermophilus.

[0133] FIG. 83 shows the amino acid sequence (SEQ. ID. No. 178) of the δ subunit of B. stearothermophilus.

[0134] FIG. 84 shows the nucleotide sequence (SEQ. ID. No. 179) of the holB gene of B. stearothermophilus.

[0135] FIG. 85 shows the amino acid sequence (SEQ. ID. No. 180) of the δ′ subunit of B. stearothermophilus.

[0136] FIGS. 86A-B show the partial nucleotide sequence (SEQ. ID. No. 181) of the dnaX gene of B. stearothermophilus.

[0137] FIG. 87 shows the partial amino acid sequence (SEQ. ID. No. 182) of the tau subunit of B. stearothermophilus.

[0138] FIGS. 88A-B show the nucleotide sequence (SEQ. ID. No. 183) of the polC gene of B. stearothermophilus.

[0139] FIG. 89 shows the amino acid sequence (SEQ. ID. No. 184) of the PolC or α-large subunit of B. stearothermophilus.

DETAILED DESCRIPTION OF THE INVENTION

[0140] In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook et al., “Molecular Cloning: A Laboratory Manual” (1989); “Current Protocols in Molecular Biology” Volumes I-III (Ausubel, R. M., ed.) (1994); “Cell Biology: A Laboratory Handbook” Volumes I-III (Celis, J. E., ed.) (1994); “Current Protocols in Immunology” Volumes I-III (Coligan, J. E., ed.) (1994); “Oligonucleotide Synthesis” (M. J. Gait, ed.) (1984); “Nucleic Acid Hybridization” (B. D. Hames & S. J. Higgins, eds.) (1985); “Transcription And Translation” (B. D. Hames & S. J. Higgins, eds.) (1984); “Animal Cell Culture” (R. I. Freshney, ed.) (1986); “Immobilized Cells And Enzymes” (IRL Press) (1986); B. Perbal, “A Practical Guide To Molecular Cloning” (1984), each of which is hereby incorporated by reference.

[0141] Therefore, if appearing herein, the following terms shall have the definitions set out below.

[0142] The terms “DNA Polymerase III,” “Polymerase III-type enzyme(s)”, “Polymerase III enzyme complex(s)”, “T.th. DNA Polymerase III”, “A. ae. DNA Polymerase III”, “T.ma. DNA Polymerase III”, and any variants not specifically listed, may be used herein interchangeably, as are subunit and sliding clamp and clamp, as are also γ complex, clamp loader, and RFC. As used throughout the present application and claims, these terms refer to proteinaceous material including single or multiple proteins, and extend to those proteins having the amino acid sequence data described herein and presented in the Figures and corresponding Sequence Listing entries, and the corresponding profile of activities set forth herein and in the Claims. Accordingly, proteins displaying substantially equivalent or altered activity are likewise contemplated. These modifications may be deliberate, for example, such as modifications obtained through site-directed mutagenesis, or may be accidental, such as those obtained through mutations in hosts that are producers of the complex or its named subunits. Also, the terms “DNA Polymerase III,” “T.th. DNA Polymerase III,” and “γ and τ subunits”, “β subunit”, “α subunit”, “ε subunit”, “δ subunit”, “δ′ subunit”, “SSB protein”, “sliding clamp” and “clamp loader” are intended to include within their scope proteins specifically recited herein as well as all substantially homologous analogs and allelic variations. As used herein γ complex refers to a particular type of clamp loader that includes a γ subunit.

[0143] Also as used herein, the term “thermolabile enzyme” refers to a DNA polymerase which is not resistant to inactivation by heat. For example, T5 DNA polymerase, the activity of which is totally inactivated by exposing the enzyme to a temperature of 90° C. for 30 seconds, is considered to be a thermolabile DNA polymerase. As used herein, a thermolabile DNA polymerase is less resistant to heat inactivation than a thermostable DNA polymerase. A thermolabile DNA polymerase typically will also have a lower optimum temperature than a thermostable DNA polymerase. Thermolabile DNA polymerases are typically isolated from mesophilic organisms, for example mesophilic bacteria or eukaryotes, including certain animals.

[0144] As used herein, the term “thermostable enzyme” refers to an enzyme which is stable to heat and is heat resistant and catalyzes (facilitates) combination of the nucleotides in the proper manner to form the primer extension products that are complementary to each nucleic acid strand. Generally, the synthesis will be initiated at the 3′ end of each primer and will proceed in the 5′ direction along the template strand, until synthesis terminates, producing molecules of different lengths.

[0145] The thermostable enzyme herein must satisfy a single criterion to be effective for the amplification reaction, i.e., the enzyme must not become irreversibly denatured (inactivated) when subjected to the elevated temperatures for the time necessary to effect denaturation of double-stranded nucleic acids. Irreversible denaturation for purposes herein refers to permanent and complete loss of enzymatic activity. The heating conditions necessary for denaturation will depend, e.g., on the buffer salt concentration and the length and nucleotide composition of the nucleic acids being denatured, but typically range from about 90° C. to about 96° C. for a time depending mainly on the temperature and the nucleic acid length, typically about 0.5 to four minutes. Higher temperatures may be tolerated as the buffer salt concentration and/or GC composition of the nucleic acid is increased. Preferably, the enzyme will not become irreversibly denatured at about 90-100° C.

[0146] The thermostable enzymes herein preferably have an optimum temperature at which they function that is higher than about 40° C., which is the temperature below which hybridization of primer to template is promoted, although, depending on (1) magnesium and salt concentrations and (2) composition and length of primer, hybridization can occur at higher temperature (e.g., 45-70° C.). The higher the temperature optimum for the enzyme, the greater the specificity and/or selectivity of the primer-directed extension process. However, enzymes that are active below 40° C., e.g., at 37° C., are also within the scope of this invention provided they are heat-stable. Preferably, the optimum temperature ranges from about 50° to about 90° C., more preferably about 60° to about 80° C. In this connection, the term “elevated temperature” as used herein is intended to cover sustained temperatures of operation of the enzyme that are equal to or higher than about 60° C.

[0147] The term “template” as used herein refers to a double-stranded or single-stranded DNA molecule which is to be amplified, synthesized, or sequenced. In the case of a double-stranded DNA molecule, denaturation of its strands to form a first and a second strand is performed before these molecules may be amplified, synthesized or sequenced. A primer complementary to a portion of a DNA template is hybridized under appropriate conditions, and the DNA polymerase of the invention may then synthesize a DNA molecule complementary to said template or a portion thereof. The newly synthesized DNA molecule, according to the invention, may be equal to or shorter in length than the original DNA template. Mismatch incorporation during the synthesis or extension of the newly synthesized DNA molecule may result in one or a number of mismatched base pairs. Thus, the synthesized DNA molecule need not be exactly complementary to the DNA template.

[0148] The term “incorporating” as used herein means becoming a part of a DNA molecule or primer.

[0149] As used herein “amplification” refers to any in vitro method for increasing the number of copies of a nucleotide sequence, or its complementary sequence, with the use of a DNA polymerase. Nucleic acid amplification results in the incorporation of nucleotides into a DNA molecule or primer thereby forming a new DNA molecule complementary to a DNA template. The formed DNA molecule and its template can be used as templates to synthesize additional DNA molecules. As used herein, one amplification reaction may consist of many rounds of DNA replication. DNA amplification reactions include, for example, polymerase chain reactions (PCR). One PCR reaction may consist of about 20 to 100 “cycles” of denaturation and synthesis of a DNA molecule. In this connection, the use of the term “long stretches of DNA” as it refers to the extension of primer along DNA is intended to cover such extensions of an average length exceeding 7 kilobases. Naturally, such length will vary, and all such variations are considered to be included within the scope of the invention.

[0150] As used herein, the term “holoenzyme” refers to a multi-subunit DNA polymerase activity comprising and resulting from various subunits which each may have distinct activities but which when contained in an enzyme reaction operate to carry out the function of the polymerase (typically DNA synthesis) and enhance its activity over use of the DNA polymerase subunit alone. For example, E. coli DNA polymerase III is a holoenzyme comprising three components of one or more subunits each: (1) a core component consisting of a heterotrimer of α, ε and θ subunits; (2) a β component consisting of a β subunit dimer; and (3) a γ complex component consisting of a heteropentamer of γ, δ, δ′, χ and ψ subunits (see Studwell and O'Donnell, 1990). These three components, and the various subunits of which they consist, are linked non-covalently to form the DNA polymerase III holoenzyme complex. However, they also function when not linked in solution.

[0151] As used herein, “enzyme complex” refers to a protein structure consisting essentially of two or more subunits of a replication enzyme, which may or may not be identical, noncovalently linked to each other to form a multi-subunit structure. An enzyme complex according to this definition ideally will have a particular enzymatic activity, up to and including the activity of the replication enzyme. For example, a “DNA pol III enzyme complex” as used herein means a multi-subunit protein activity comprising two or more of the subunits of the DNA pol III replication enzyme as defined above, and having DNA polymerizing or synthesizing activity. Thus, this term encompasses the native replication enzyme, as well as an enzyme complex lacking one or more of the subunits of the replication enzyme (e.g., DNA pol III exo-, which lacks the ε subunit).

[0152] The amino acid residues described herein are preferred to be in the “L” isomeric form. However, residues in the “D” isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property of immunoglobulin-binding is retained by the polypeptide. NH₂ refers to the free amino group present at the amino terminus of a polypeptide. COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide. In keeping with standard polypeptide nomenclature, J. Biol. Chem., 243:3552-59 (1969), abbreviations for amino acid residues are shown in the following Table of Correspondence:

TABLE OF CORRESPONDENCE

SYMBOL
1-Letter    3-Letter    AMINO ACID
Y           Tyr         tyrosine
G           Gly         glycine
F           Phe         phenylalanine
M           Met         methionine
A           Ala         alanine
S           Ser         serine
I           Ile         isoleucine
L           Leu         leucine
T           Thr         threonine
V           Val         valine
P           Pro         proline
K           Lys         lysine
H           His         histidine
Q           Gln         glutamine
E           Glu         glutamic acid
W           Trp         tryptophan
R           Arg         arginine
D           Asp         aspartic acid
N           Asn         asparagine
C           Cys         cysteine
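The Table of Correspondence can also be written as a lookup table for converting between the two notations. The following is a minimal sketch: the function names are hypothetical, and joining three-letter symbols with a hyphen is an assumed convention for illustration.

```python
# One-letter to three-letter symbols, in the order given in the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn", "C": "Cys",
}
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}

def to_three_letter(sequence):
    """Convert a one-letter sequence (amino to carboxy terminus) to three-letter symbols."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())

def to_one_letter(sequence):
    """Convert hyphen-separated three-letter symbols back to one-letter notation."""
    return "".join(THREE_TO_ONE[symbol.capitalize()] for symbol in sequence.split("-"))

print(to_three_letter("MKV"))
print(to_one_letter("Met-Lys-Val"))
```

A residue symbol outside the twenty listed raises a `KeyError`, which makes unexpected characters in a sequence easy to spot.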

[0153] It should be noted that all amino-acid residue sequences are represented herein by formulae whose left and right orientation is in the conventional direction of amino-terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino-acid residues. The above Table is presented to correlate the three-letter and one-letter notations which may appear alternately herein.

[0154] A “replicon” is any genetic element (e.g., plasmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo; i.e., capable of replication under its own control.

[0155] A “vector” is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.

[0156] A “DNA molecule” refers to the polymeric form of deoxyribonucleotides (adenine, guanine, thymine, or cytosine) in either its single-stranded form or a double-stranded helix. This term refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear DNA molecules (e.g., restriction fragments), viruses, plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).
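The strand convention can be made concrete with a short sketch. The example sequence and function names below are hypothetical. Given the nontranscribed strand written 5′ to 3′, the mRNA carries the same sequence with U in place of T, and the transcribed (template) strand is its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def template_strand(nontranscribed):
    """Return the transcribed strand, written 5' to 3' (the reverse complement)."""
    return nontranscribed.translate(COMPLEMENT)[::-1]

def mrna_from_nontranscribed(nontranscribed):
    """Return the mRNA sequence, which matches the nontranscribed strand with U for T."""
    return nontranscribed.replace("T", "U")

# Illustrative sequence only; not taken from the Sequence Listing.
nontranscribed = "ATGGCTTAA"
print(template_strand(nontranscribed))           # TTAAGCCAT
print(mrna_from_nontranscribed(nontranscribed))  # AUGGCUUAA
```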

[0157] An “origin of replication” refers to those DNA sequences that participate in DNA synthesis.

[0158] A DNA “coding sequence” is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. A polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.
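The boundary rule for a coding sequence can be sketched as a simple scan for a start codon followed by the first in-frame stop codon. This is an illustration under stated assumptions: the function name and example sequence are hypothetical, ATG is assumed as the start codon, and only the strand supplied is searched.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_coding_sequence(dna, start_codon="ATG"):
    """Return (start, end) of the first start codon with an in-frame stop codon.

    `end` is the index just past the stop codon, so dna[start:end] is the
    coding sequence including both boundary codons. Returns None if no such
    region exists.
    """
    start = dna.find(start_codon)
    while start != -1:
        # Step through codons in the frame set by the start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = dna.find(start_codon, start + 1)
    return None

# Illustrative sequence only.
dna = "CCATGGCTAAATGACC"
bounds = find_coding_sequence(dna)
print(bounds, dna[bounds[0]:bounds[1]])
```

The scan ignores regulatory context, which is why the definition above ties a coding sequence to appropriate regulatory sequences as well as to its start and stop codons.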

[0159] Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, polyadenylation signals, terminators, and the like, that provide for the expression of a coding sequence in a host cell.

[0160] A “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3′ direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase. Eukaryotic promoters will often, but not always, contain “TATA” boxes and “CAT” boxes. Prokaryotic promoters contain Shine-Dalgarno sequences in addition to the −10 and −35 consensus sequences.

[0161] An “expression control sequence” is a DNA sequence that controls and regulates the transcription and translation of another DNA sequence. A coding sequence is “under the control” of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then translated into the protein encoded by the coding sequence.

[0162] A “signal sequence” can be included before the coding sequence. This sequence encodes a signal peptide, N-terminal to the polypeptide, that communicates to the host cell to direct the polypeptide to the cell surface or secrete the polypeptide into the media, and this signal peptide is clipped off by the host cell before the protein leaves the cell. Signal sequences can be found associated with a variety of proteins native to prokaryotes and eukaryotes.

[0163] The term “oligonucleotide,” as used generally herein, such as in referring to probes prepared and used in the present invention, is defined as a molecule comprised of two or more (deoxy)ribonucleotides, preferably more than three. Its exact size will depend upon many factors which, in turn, depend upon the ultimate function and use of the oligonucleotide.

[0164] The term “primer” as used herein refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product, which is complementary to a nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing agent such as a DNA polymerase and at a suitable temperature and pH. The primer may be either single-stranded or double-stranded and must be sufficiently long to prime the synthesis of the desired extension product in the presence of the inducing agent. The exact length of the primer will depend upon many factors, including temperature, source of primer and use of the method. For example, for diagnostic applications, depending on the complexity of the target sequence, the oligonucleotide primer typically contains 15-25 or more nucleotides, although it may contain fewer nucleotides.

[0165] The primers herein are selected to be “substantially” complementary to different strands of a particular target DNA sequence. This means that the primers must be sufficiently complementary to hybridize with their respective strands. Therefore, the primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5′ end of the primer, with the remainder of the primer sequence being complementary to the strand. Alternatively, non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence of the strand to hybridize therewith and thereby form the template for the synthesis of the extension product.

[0166] As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.

[0167] A cell has been “transformed” by exogenous or heterologous DNA when such DNA has been introduced inside the cell. The transforming DNA may or may not be integrated (covalently linked) into chromosomal DNA making up the genome of the cell. In prokaryotes, yeast, and mammalian cells, for example, the transforming DNA may be maintained on an episomal element such as a plasmid. With respect to eukaryotic cells, a stably transformed cell is one in which the transforming DNA has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population of daughter cells containing the transforming DNA. A “clone” is a population of cells derived from a single cell or common ancestor by mitosis. A “cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations.

[0168] Two DNA sequences are “substantially homologous” when at least about 75% (preferably at least about 80%, and most preferably at least about 90 or 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Suitable conditions include those characterized by a hybridization buffer comprising 0.9M sodium citrate (“SSC”) buffer at a temperature of about 37° C. and washing in SSC buffer at a temperature of about 37° C.; and preferably in a hybridization buffer comprising 20% formamide in 0.9M SSC buffer at a temperature of about 42° C. and washing with 0.2×SSC buffer at about 42° C. Stringency conditions can be further varied by modifying the temperature and/or salt content of the buffer, or by modifying the length of the hybridization probe as is known to those of skill in the art. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Maniatis et al., 1982; Glover, 1985; Hames and Higgins, 1984.
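
The percentage criterion in the preceding definition can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to the same length (the alignment step itself is not shown), and the function names `percent_identity` and `is_substantially_homologous_dna` are hypothetical names chosen for this illustration, not part of any referenced software.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both sequences cover the same "defined length", i.e. they
    were aligned beforehand and have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def is_substantially_homologous_dna(seq_a: str, seq_b: str,
                                    threshold: float = 75.0) -> bool:
    """Apply the 'at least about 75%' nucleotide-match criterion."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # 7 of 8 positions match, so the pair clears the 75% threshold.
    print(percent_identity("ATGCATGC", "ATGCATGA"))
    print(is_substantially_homologous_dna("ATGCATGC", "ATGCATGA"))
```

Raising `threshold` to 80, 90 or 95 corresponds to the preferred and most preferred levels named in the definition.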

[0169] It should be appreciated that also within the scope of the present invention are degenerate DNA sequences. By “degenerate” is meant that a different three-letter codon is used to specify a particular amino acid. It is well known in the art that the following codons can be used interchangeably to code for each specific amino acid:

Phenylalanine (Phe or F): UUU or UUC
Leucine (Leu or L): UUA or UUG or CUU or CUC or CUA or CUG
Isoleucine (Ile or I): AUU or AUC or AUA
Methionine (Met or M): AUG
Valine (Val or V): GUU or GUC or GUA or GUG
Serine (Ser or S): UCU or UCC or UCA or UCG or AGU or AGC
Proline (Pro or P): CCU or CCC or CCA or CCG
Threonine (Thr or T): ACU or ACC or ACA or ACG
Alanine (Ala or A): GCU or GCC or GCA or GCG
Tyrosine (Tyr or Y): UAU or UAC
Histidine (His or H): CAU or CAC
Glutamine (Gln or Q): CAA or CAG
Asparagine (Asn or N): AAU or AAC
Lysine (Lys or K): AAA or AAG
Aspartic Acid (Asp or D): GAU or GAC
Glutamic Acid (Glu or E): GAA or GAG
Cysteine (Cys or C): UGU or UGC
Arginine (Arg or R): CGU or CGC or CGA or CGG or AGA or AGG
Glycine (Gly or G): GGU or GGC or GGA or GGG
Tryptophan (Trp or W): UGG
Termination codon: UAA (ochre) or UAG (amber) or UGA (opal)

[0170] It should be understood that the codons specified above are for RNA sequences. The corresponding codons for DNA have a T substituted for U.
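
Codon degeneracy and the U-to-T correspondence between RNA and DNA codons can be sketched in Python as follows. This is an illustrative sketch that assumes the standard genetic code; the names `CODON_TO_AA`, `synonymous_codons` and `translate` are hypothetical and are introduced only for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order at each
# position; "*" marks a termination codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(aa: str, dna: bool = False) -> list[str]:
    """Return all codons for a one-letter amino acid code, sorted.

    With dna=True the RNA codons are rewritten with T substituted for U.
    """
    codons = sorted(c for c, a in CODON_TO_AA.items() if a == aa)
    return [c.replace("U", "T") for c in codons] if dna else codons


def translate(seq: str) -> str:
    """Translate an RNA or DNA coding sequence, stopping at a stop codon."""
    rna = seq.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TO_AA[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(synonymous_codons("I"))            # isoleucine codons (RNA)
    print(synonymous_codons("A", dna=True))  # alanine codons (DNA)
    # Two degenerate DNA sequences that encode the same polypeptide.
    print(translate("ATGGCTTAA") == translate("ATGGCCTAA"))
```

Two DNA molecules that differ only in synonymous codons, such as the two sequences in the last line, translate to the same amino acid sequence, which is the sense in which the sequences are degenerate.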

[0171] Mutations can be made, e.g., in SEQ. ID. No. 1, or any of the nucleic acids set forth herein, such that a particular codon is changed to a codon which codes for a different amino acid. Such a mutation is generally made by making the fewest nucleotide changes possible. A substitution mutation of this sort can be made to change an amino acid in the resulting protein in a non-conservative manner (i.e., by changing the codon from an amino acid belonging to a grouping of amino acids having a particular size or characteristic to an amino acid belonging to another grouping) or in a conservative manner (i.e., by changing the codon from an amino acid belonging to a grouping of amino acids having a particular size or characteristic to an amino acid belonging to the same grouping). Such a conservative change generally leads to less change in the structure and function of the resulting protein. A non-conservative change is more likely to alter the structure, activity or function of the resulting protein. The present invention should be considered to include sequences containing conservative changes which do not significantly alter the activity or binding characteristics of the resulting protein.

[0172] The following is one example of various groupings of amino acids:

[0173] Amino Acids with Nonpolar R Groups

[0174] Alanine

[0175] Valine

[0176] Leucine

[0177] Isoleucine

[0178] Proline

[0179] Phenylalanine

[0180] Tryptophan

[0181] Methionine

[0182] Amino Acids with Uncharged Polar R Groups

[0183] Glycine

[0184] Serine

[0185] Threonine

[0186] Cysteine

[0187] Tyrosine

[0188] Asparagine

[0189] Glutamine

[0190] Amino Acids with Charged Polar R Groups (Negatively Charged at pH 6.0)

[0191] Aspartic acid

[0192] Glutamic acid

[0193] Basic Amino Acids (Positively Charged at pH 6.0)

[0194] Lysine

[0195] Arginine

[0196] Histidine (at pH 6.0)

[0197] Amino Acids with Phenyl Groups:

[0198] Phenylalanine

[0199] Tryptophan

[0200] Tyrosine

[0201] Another grouping may be according to molecular weight (i.e., size of R groups):

Glycine: 75
Alanine: 89
Serine: 105
Proline: 115
Valine: 117
Threonine: 119
Cysteine: 121
Leucine: 131
Isoleucine: 131
Asparagine: 132
Aspartic acid: 133
Glutamine: 146
Lysine: 146
Glutamic acid: 147
Methionine: 149
Histidine (at pH 6.0): 155
Phenylalanine: 165
Arginine: 174
Tyrosine: 181
Tryptophan: 204

[0202] Particularly preferred substitutions are:

[0203] Lys for Arg and vice versa such that a positive charge may be maintained;

[0204] Glu for Asp and vice versa such that a negative charge may be maintained;

[0205] Ser for Thr such that a free —OH can be maintained; and

[0206] Gln for Asn such that a free NH₂ can be maintained.

[0207] Amino acid substitutions may also be introduced to substitute an amino acid with a particularly preferable property. For example, a Cys may be introduced into a potential site for disulfide bridges with another Cys. A His may be introduced as a particularly “catalytic” site (i.e., His can act as an acid or base and is the most common amino acid in biochemical catalysis). Pro may be introduced because of its particularly planar structure, which induces β-turns in the protein's structure.

[0208] Two amino acid sequences are “substantially homologous” when at least about 70% of the amino acid residues (preferably at least about 80%, and most preferably at least about 90 or 95%) are identical, or represent conservative substitutions.
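
As a hedged illustration of this definition, the following Python sketch counts a position as a match when the two residues are identical or fall in the same one of the four charge and polarity groupings listed above (nonpolar, uncharged polar, negatively charged, basic). Restricting the test to those four groupings, assuming pre-aligned sequences of equal length, and the names `GROUPS`, `is_conservative` and `percent_similar` are all assumptions made for this example.

```python
# One-letter codes for the four groupings listed above (an illustrative
# choice; the phenyl-group and molecular-weight groupings are not used here).
GROUPS = [
    set("AVLIPFWM"),  # nonpolar R groups
    set("GSTCYNQ"),   # uncharged polar R groups
    set("DE"),        # negatively charged at pH 6.0
    set("KRH"),       # basic, positively charged at pH 6.0
]


def is_conservative(a: str, b: str) -> bool:
    """True if two different residues belong to the same grouping."""
    return a != b and any(a in g and b in g for g in GROUPS)


def percent_similar(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical or conservative."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b or is_conservative(a, b)
    )
    return 100.0 * hits / len(seq_a)


if __name__ == "__main__":
    # K->R, D->E and E->D are conservative; M->G crosses groupings.
    print(percent_similar("MKDE", "MRED"))
    print(percent_similar("MKDE", "GKDE") >= 70.0)
```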

[0209] A “heterologous” region of the DNA construct is an identifiable segment of DNA within a larger DNA molecule that is not found in association with the larger molecule in nature. Thus, when the heterologous region encodes a mammalian gene, the gene will usually be flanked by DNA that does not flank the mammalian genomic DNA in the genome of the source organism. Another example of a heterologous coding sequence is a construct where the coding sequence itself is not found in nature (e.g., a cDNA where the genomic coding sequence contains introns, or synthetic sequences having codons different than the native gene). Allelic variations or naturally-occurring mutational events do not give rise to a heterologous region of DNA as defined herein.

[0210] An “antibody” is any immunoglobulin, including antibodies and fragments thereof, that binds a specific epitope. The term encompasses polyclonal, monoclonal, and chimeric antibodies, the last mentioned described in further detail in U.S. Pat. Nos. 4,816,397 to Boss et al. and 4,816,567 to Cabilly et al.

[0211] An “antibody combining site” is that structural portion of an antibody molecule comprised of heavy and light chain variable and hypervariable regions that specifically binds antigen.

[0212] The phrase “antibody molecule” in its various grammatical forms as used herein contemplates both an intact immunoglobulin molecule and an immunologically active portion of an immunoglobulin molecule. Exemplary antibody molecules are intact immunoglobulin molecules, substantially intact immunoglobulin molecules and those portions of an immunoglobulin molecule that contain the paratope, including those portions known in the art as Fab, Fab′, F(ab′)₂ and F(v), which portions are preferred for use in the therapeutic methods described herein. Fab and F(ab′)₂ portions of antibody molecules are prepared by the proteolytic reaction of papain and pepsin, respectively, on substantially intact antibody molecules by methods that are well-known. See, for example, U.S. Pat. No. 4,342,566 to Theofilopoulos et al. Fab′ antibody molecule portions are also well-known and are produced from F(ab′)₂ portions followed by reduction of the disulfide bonds linking the two heavy chain portions as with mercaptoethanol, and followed by alkylation of the resulting protein mercaptan with a reagent such as iodoacetamide. An antibody containing intact antibody molecules is preferred herein.

[0213] The phrase “monoclonal antibody” in its various grammatical forms refers to an antibody having only one species of antibody combining site capable of immunoreacting with a particular antigen. A monoclonal antibody thus typically displays a single binding affinity for any antigen with which it immunoreacts. A monoclonal antibody may therefore contain an antibody molecule having a plurality of antibody combining sites, each immunospecific for a different antigen; e.g., a bispecific (chimeric) monoclonal antibody.

[0214] A DNA sequence is “operatively linked” to an expression control sequence when the expression control sequence controls and regulates the transcription and translation of that DNA sequence. The term “operatively linked” includes having an appropriate start signal (e.g., ATG) in front of the DNA sequence to be expressed and maintaining the correct reading frame to permit expression of the DNA sequence under the control of the expression control sequence and production of the desired product encoded by the DNA sequence. If a gene that one desires to insert into a recombinant DNA molecule does not contain an appropriate start signal, such a start signal can be inserted in front of the gene.

[0215] The term “standard hybridization conditions” refers to salt and temperature conditions substantially equivalent to 5×SSC and 65° C. for both hybridization and wash. However, one skilled in the art will appreciate that such “standard hybridization conditions” are dependent on particular conditions including the concentration of sodium and magnesium in the buffer, nucleotide sequence, length and concentration, percent mismatch, percent formamide, and the like. Also important in the determination of “standard hybridization conditions” is whether the two sequences hybridizing are RNA-RNA, DNA-DNA or RNA-DNA. Such standard hybridization conditions are easily determined by one skilled in the art according to well known formulae, wherein hybridization is typically 10-20° C. below the predicted or determined T_m with washes of higher stringency, if desired.
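
The relationship between a predicted T_m and a hybridization temperature 10-20° C. below it can be sketched as follows. The text does not say which formula is intended; this example assumes the simple Wallace rule for short oligonucleotides (2° C. per A or T, 4° C. per G or C), which ignores salt, formamide and mismatch effects. The names `wallace_tm` and `hybridization_window` are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    Assumption: a short DNA oligonucleotide (roughly 14-20 nt) containing
    only A, C, G and T; buffer composition is not taken into account.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridization_window(tm: float) -> tuple[float, float]:
    """Temperature range 20 to 10 degrees C below the given T_m."""
    return (tm - 20, tm - 10)


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGC"  # hypothetical 16-nt probe
    tm = wallace_tm(probe)
    print(tm, hybridization_window(tm))
```

For longer probes or for RNA-containing duplexes, a salt- and formamide-adjusted formula would be the more appropriate choice.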

[0216] In its primary aspect, the present invention concerns the identification of a class of DNA Polymerase III-type enzymes or complexes found in thermophilic bacteria such as Thermus thermophilus (T.th.), Aquifex aeolicus (A.ae), Thermotoga maritima (T.ma.), Bacillus stearothermophilus (B.st) and other eubacteria which exhibit the following characteristics, among their properties: the ability to extend a primer over a long stretch of ssDNA at elevated temperature, stimulation by its cognate sliding clamp of the type that is assembled on DNA by a clamp loader, accessory subunits that exhibit DNA-stimulated ATPase activity at elevated temperature and/or ionic strength, and an associated 3′-5′ exonuclease activity. In a particular aspect, the invention extends to Polymerase III-type enzymes derived from a broad class of thermophilic eubacteria that include polymerases isolated from the thermophilic bacteria Aquifex aeolicus (A.ae polymerase) and other members of the Aquifex genus; Thermus thermophilus (T.th. polymerase), Thermus flavus (Tfl/Tub polymerase), Thermus ruber (Tru polymerase), Thermus brockianus (DYNAZYME™ polymerase) and other members of the Thermus genus; Bacillus stearothermophilus (Bst polymerase) and other members of the Bacillus genus; Thermoplasma acidophilum (Tac polymerase) and other members of the Thermoplasma genus; and Thermotoga neapolitana (Tne polymerase; See WO 96/10640 to Chatterjee et al.), Thermotoga maritima (Tma polymerase; See U.S. Pat. No. 5,374,553 to Gelfand et al.), and other members of the Thermotoga genus. The particular polymerase discussed herein, by way of illustration and not limitation, is the enzyme derived from T.th., A.ae, T.ma., or B.st.

[0217] Polymerase III-type enzymes covered by the invention include those that may be prepared by purification from cellular material, as described in detail in the Examples infra, as well as enzyme assemblies or complexes that comprise the combination of individually prepared enzyme subunits or components. Accordingly, the entire enzyme may be prepared by purification from cellular material, or may be constructed by the preparation of the individual components and their assembly into the functional enzyme. A representative and non-limitative protocol for the preparation of an enzyme by this latter route is set forth in U.S. Pat. No. 5,583,026 to O'Donnell, and the disclosure thereof is incorporated herein in its entirety for such purpose.

[0218] Likewise, individual subunits may be modified, e.g., as by incorporation therein of single residue substitutions to create active sites therein, for the purpose of imparting new or enhanced properties to enzymes containing the modified subunits (see, e.g., Tabor, 1995). Likewise, individual subunits prepared in accordance with the invention may be used individually and, for example, may be substituted for their counterparts in other enzymes, to improve or particularize the properties of the resultant modified enzyme. Such modifications are within the skill of the art and are considered to be included within the scope of the present invention.

[0219] Accordingly, the invention includes the various subunits that may comprise the enzymes, and accordingly extends to the genes and corresponding proteins that may be encoded thereby, such as the α (as well as PolC), β, γ, ε, τ, δ and δ′ subunits, respectively. More particularly, in Thermus thermophilus the α subunit corresponds to dnaE, the β subunit corresponds to dnaN, the ε subunit corresponds to dnaQ, and the γ and τ subunits correspond to dnaX, the δ subunit corresponds to holA, and the δ′ subunit corresponds to holB. In Aquifex aeolicus and Thermotoga maritima, the α subunit corresponds to dnaE, the β subunit corresponds to dnaN, the ε subunit corresponds to dnaQ, the τ subunit corresponds to dnaX, the δ subunit corresponds to holA, and the δ′ subunit corresponds to holB. In Bacillus stearothermophilus, the PolC which has both α and ε activities corresponds to polC, the β subunit corresponds to dnaN, the ε subunit corresponds to dnaQ, the τ subunit corresponds to dnaX, the δ subunit corresponds to holA, and the δ′ subunit corresponds to holB.

[0220] Accordingly, the Polymerase III-type enzyme of the present invention comprises at least one gene encoding a subunit thereof, which gene is selected from the group consisting of dnaX, dnaQ, dnaE, dnaN, holA, holB, and combinations thereof. More particularly, the invention extends to the nucleic acid molecule encoding them and their encoded subunits.

[0221] In the T.th. Pol III enzyme, this includes the following nucleotide sequences: dnaX (SEQ. ID. No. 3), dnaE (SEQ. ID. No. 86), dnaQ (SEQ. ID. No. 94), dnaN (SEQ. ID. No. 106), holA (SEQ. ID. No. 157), and holB (SEQ. ID. No. 155).

[0222] In the A.ae Pol III enzyme, this includes the following nucleotide sequences: dnaX (SEQ. ID. No. 119), dnaE (SEQ. ID. No. 117), dnaQ (SEQ. ID. No. 127), dnaN (SEQ. ID. No. 121), holA (SEQ. ID. No. 123), and holB (SEQ. ID. No. 125).

[0223] In the T.ma. Pol III enzyme, this includes the following nucleotide sequences: dnaX (SEQ. ID. No. 141), dnaE (SEQ. ID. No. 137), dnaQ (SEQ. ID. No. 139), dnaN (SEQ. ID. No. 143), holA (SEQ. ID. No. 145), and holB (SEQ. ID. No. 147).

[0224] In the B.st. Pol III enzyme, this includes the following nucleotide sequences: dnaX (SEQ. ID. No. 181), dnaN (SEQ. ID. No. 173), holA (SEQ. ID. No. 177), holB (SEQ. ID. No. 179), and polC (SEQ. ID. Nos. 183).

[0225] In each of the Pol III-type enzymes of the present invention, not only are each of the above-identified coding sequences contemplated, but also conserved variants, active fragments and analogs thereof.

[0226] A particular T.th. Polymerase III-type enzyme in accordance with the invention may include at least one of the following subunits: a γ subunit having an amino acid sequence corresponding to SEQ. ID. Nos. 4 and 5; a τ subunit having an amino acid sequence corresponding to SEQ. ID. No. 2; an ε subunit having an amino acid sequence corresponding to SEQ. ID. No. 95; an α subunit including an amino acid sequence corresponding to SEQ. ID. No. 87; a β subunit having an amino acid sequence corresponding to SEQ. ID. No. 107; a δ subunit having an amino acid sequence corresponding to SEQ. ID. No. 158; a δ′ subunit having an amino acid sequence corresponding to SEQ. ID. No. 156; as well as variants, including allelic variants, muteins, analogs and fragments of any of the subunits, and compatible combinations thereof, capable of functioning in DNA amplification and sequencing.

[0227] A particular A.ae Polymerase III-type enzyme in accordance with the invention may include at least one of the following subunits: a τ subunit having an amino acid sequence corresponding to SEQ. ID. No. 120; an ε subunit having an amino acid sequence corresponding to SEQ. ID. No. 128; an α subunit including an amino acid sequence corresponding to SEQ. ID. No. 118; a β subunit having an amino acid sequence corresponding to SEQ. ID. No. 122; a δ subunit having an amino acid sequence corresponding to SEQ. ID. No. 124; a δ′ subunit having an amino acid sequence corresponding to SEQ. ID. No. 126; as well as variants, including allelic variants, muteins, analogs and fragments of any of the subunits, and compatible combinations thereof, capable of functioning in DNA amplification and sequencing.

[0228] A particular T.ma. Polymerase III-type enzyme in accordance with the invention may include at least one of the following subunits: a τ subunit having an amino acid sequence corresponding to SEQ. ID. No. 142; an ε subunit having an amino acid sequence corresponding to SEQ. ID. No. 140; an α subunit including an amino acid sequence corresponding to SEQ. ID. No. 138; a β subunit having an amino acid sequence corresponding to SEQ. ID. No. 144; a δ subunit having an amino acid sequence corresponding to SEQ. ID. No. 146; a δ′ subunit having an amino acid sequence corresponding to SEQ. ID. No. 148; as well as variants, including allelic variants, muteins, analogs and fragments of any of the subunits, and compatible combinations thereof, capable of functioning in DNA amplification and sequencing.

[0229] A particular B.st. Polymerase III-type enzyme in accordance with the invention may include at least one of the following subunits: a τ subunit having a partial amino acid sequence corresponding to SEQ. ID. No. 182; a β subunit having an amino acid sequence corresponding to SEQ. ID. No. 174; a δ subunit having an amino acid sequence corresponding to SEQ. ID. No. 178; a δ′ subunit having an amino acid sequence corresponding to SEQ. ID. No. 180; a PolC subunit having an amino acid sequence corresponding to SEQ. ID. Nos. 184; as well as variants, including allelic variants, muteins, analogs and fragments of any of the subunits, and compatible combinations thereof, capable of functioning in DNA amplification and sequencing.

[0230] The invention also includes and extends to the use and application of the enzyme and/or one or more of its components for DNA molecule amplification and sequencing by the methods set forth hereinabove, and in greater detail later on herein.

[0231] One of the subunits of the invention is the T.th. γ/τ subunit encoded by a dnaX gene, which frameshifts as much as −2 with high efficiency, and that, upon frameshifting, leads to the addition of more than one extra amino acid residue to the C-terminus (to form the γ subunit). Further, the invention likewise extends to a dnaX gene derived from a thermophile such as T.th., that possesses the frameshift defined herein and that codes for expression of the γ and τ subunits of DNA Polymerase III.

[0232] The present invention provides methods for amplifying or sequencing a nucleic acid molecule comprising contacting the nucleic acid molecule with a composition comprising a DNA polymerase III enzyme (DNA pol III) complex (for sequencing, preferably a DNA pol III complex that is substantially reduced in 3′-5′ exonuclease activity). DNA pol III complexes used in the methods of the present invention are thermostable.

[0233] The invention also provides DNA molecules amplified by the present methods, methods of preparing a recombinant vector comprising inserting a DNA molecule amplified by the present methods into a vector, which is preferably an expression vector, and recombinant vectors prepared by these methods.

[0234] The invention also provides methods of preparing a recombinant host cell comprising inserting a DNA molecule amplified by the present methods into a host cell, which is preferably a bacterial cell, most preferably an Escherichia coli cell; a yeast cell; or an animal cell, most preferably an insect cell, a nematode cell or a mammalian cell. The invention also provides recombinant host cells prepared by these methods.

[0235] In additional preferred embodiments, the present invention provides kits for amplifying or sequencing a nucleic acid molecule. DNA amplification kits according to the invention comprise a carrier means having in close confinement therein two or more container means, wherein a first container means contains a DNA polymerase III enzyme complex and a second container means contains a deoxynucleoside triphosphate. DNA sequencing kits according to the present invention comprise a first container means that contains a multi-protein Pol III-type enzyme complex and a second container means that contains a dideoxynucleoside triphosphate. The DNA pol III contained in the container means of such kits is preferably substantially reduced in 5′-3′ exonuclease activity, may be thermostable, and may be isolated from the thermophilic cellular sources described above.

[0236] DNA pol III-type enzyme complexes for use in the present invention may be isolated from any organism that produces the DNA pol III-type enzyme complexes naturally or recombinantly. Such enzyme complexes may be thermostable, isolated from a variety of thermophilic organisms.

[0237] The thermostable DNA polymerase III-type enzymes or complexes that are an important aspect of this invention may be isolated from a variety of thermophilic bacteria that are available commercially (for example, from American Type Culture Collection, Rockville, Md.). Suitable for use as sources of thermostable enzymes are the thermophilic eubacteria Aquifex aeolicus and other species of the Aquifex genus; Thermus aquaticus, Thermus thermophilus, Thermus flavus, Thermus ruber, Thermus brockianus, and other species of the Thermus genus; Bacillus stearothermophilus, Bacillus subtilis, and other species of the Bacillus genus; Thermoplasma acidophilum and other species of the Thermoplasma genus; Thermotoga neapolitana, Thermotoga maritima and other species of the Thermotoga genus; and mutants of each of these species. It will be understood by one of ordinary skill in the art, however, that any thermophilic microorganism might be used as a source of thermostable DNA pol III-type enzymes and polypeptides for use in the methods of the present invention. Bacterial cells may be grown according to standard microbiological techniques, using culture media and incubation conditions suitable for growing active cultures of the particular thermophilic species that are well-known to one of ordinary skill in the art (see, e.g., Brock et al., 1969; Oshima et al., 1974). Thermostable DNA pol III complexes may then be isolated from such thermophilic cellular sources as described for thermolabile complexes above.

[0238] Several methods are available for identifying homologous nucleic acids and protein subunits in other thermophilic eubacteria, either those listed above or otherwise. These methods include the following:

[0239] (1) The following procedure was used to obtain the genes encoding T.th. ε (dnaQ), τ/γ (dnaX), DnaA (dnaA), and β (dnaN). Protein sequences encoded by genes of non-thermophilic bacteria (i.e., mesophiles) are aligned to identify highly conserved amino acid sequences. PCR primers at conserved positions are designed using the codon usage of the organism of interest to amplify an internal section of the gene from genomic DNA extracted from the organism. The PCR product is sequenced. New primers are designed near the ends of the sequence to obtain new sequence that flanks the ends using circular PCR (also called inverse PCR) on genomic DNA that has been cut with the appropriate restriction enzyme and ligated into circles. These new PCR products are sequenced. The procedure is repeated until the entire gene sequence has been obtained. Also, dnaN (encoding β) is located next to dnaA in bacteria and, therefore, dnaN can be obtained by cloning DNA flanking the dnaA gene by the circular PCR procedure starting within dnaA. Once the gene is obtained, it is cloned into an expression vector for protein production.

[0240] (2) The following procedure was used to obtain the gene encoding T.th. α polymerase (dnaE gene). The DNA polymerase III can be purified directly from the organism of interest and the amino acid sequence of the subunit(s) obtained directly. In the case of T.th., T.th. cells were lysed and proteins were fractionated. An antibody against E. coli α, which reacted with T.th. α, was used to probe column fractions by Western analysis. The T.th. α was transferred to a membrane, proteolyzed, and fragments were sequenced. The sequence was used to design PCR primers for amplification of an internal section of the dnaE gene. Remaining flanking sequences are then obtained by circular PCR.

[0241] (3) The following procedure can be used to identify published nucleotide sequences which have not yet been identified as to their function. This method was used to obtain T.th. δ (holA) and δ′ (holB), although they could presumably also have been obtained via Methods 1 and 2 above. Discovery of T.th. dnaE (α), dnaN (β) and dnaX (τ/γ) indicates that thermophiles use a class III type of DNA polymerase (α) that utilizes a clamp (β) and must also use a clamp loader since they have τ/γ. Also, the biochemical experiments in the Examples infra show that the T.th. polymerase functions with the T.th. β clamp. Having demonstrated that a thermophile (e.g., T.th.) does indeed utilize a class III type of polymerase with a clamp and clamp loader, it can be assumed that they may have δ and δ′ subunits needed to form a complex with τ/γ for functional clamp loading activity (i.e., as shown in E. coli, δ and δ′ bind either τ or γ to form τδδ′ or γδδ′ complex, both of which are functional clamp loaders). The δ subunit is not very well conserved, but does give a match in the sequence databases for A.ae, T.ma., and T.th. The T.th. database provided limited information on the amino acid sequence of δ subunit, although one can easily obtain the complete sequence of T.th. holA by PCR and circular PCR as outlined above in Method 1. The A.ae and T.ma. databases are complete and, therefore, the entire holA sequences from these genomes are identified. Neither database recognized these sequences as δ encoded by holA. The δ′ subunit (holB) is fairly well conserved. Again the incomplete T.th. database provided limited δ′ sequence, but as with δ, it is a straightforward process for anyone experienced in the area to obtain the rest of the holB sequence using PCR and circular PCR as described in Method 1. Neither the A.ae nor T.ma. databases recognized holB encoding δ′. Nevertheless, holB was identified as encoding δ′ by searching the databases with δ′ sequence. In each case, the Thermotoga maritima and Aquifex aeolicus holB gene and δ′ sequence were obtained in their entirety. Neither database had previously annotated holA or holB encoding δ and δ′.

[0242] As stated above and in accordance with the present invention, once nucleic acid molecules have been obtained, they may be amplified according to any of the literature-described manual or automated amplification methods. Such methods include, but are not limited to, PCR (U.S. Pat. No. 4,683,195 to Mullis et al. and U.S. Pat. No. 4,683,202 to Mullis), Strand Displacement Amplification (SDA) (U.S. Pat. No. 5,455,166 to Walker), and Nucleic Acid Sequence-Based Amplification (NASBA) (U.S. Pat. No. 5,409,818 to Davey et al.; EP 329,822 to Davey et al.). Most preferably, nucleic acid molecules are amplified by the methods of the present invention using PCR-based amplification techniques.

[0243] In the initial steps of each of these amplification methods, the nucleic acid molecule to be amplified is contacted with a composition comprising a DNA polymerase belonging to the evolutionary “family A” class (e.g., Taq DNA pol I or E. coli pol I) or the “family B” class (e.g., Vent and Pfu DNA polymerases; see Ito and Braithwaite, 1991). All of these DNA polymerases are present as single subunits and are primarily involved in DNA repair. In contrast, the DNA pol III-type enzymes are multisubunit complexes that mainly function in the replication of the chromosome, and the subunit containing the DNA polymerase activity is in the “family C” class.

[0244] Thus, in amplifying a nucleic acid molecule according to the methods of the present invention, the nucleic acid molecule is contacted with a composition comprising a thermostable DNA pol III-type enzyme complex.

[0245] Once the nucleic acid molecule to be amplified is contacted with the DNA pol III-type complex, the amplification reaction may proceed according to standard protocols for each of the above-described techniques. Since most of these techniques comprise a high-temperature denaturation step, if a thermolabile DNA pol III-type enzyme complex is used in nucleic acid amplification by any of these techniques, the enzyme would need to be added at the start of each amplification cycle, since it would be heat-inactivated at the denaturation step. However, a thermostable DNA pol III-type complex used in these methods need only be added once at the start of the amplification (as for Taq DNA polymerase in traditional PCR amplifications), as its activity will be unaffected by the high temperature of the denaturation step. It should be noted, however, that because DNA pol III-type enzymes may have a much more rapid rate of nucleotide incorporation than the polymerases commonly used in these amplification techniques, the cycle times may need to be adjusted to shorter intervals than would be standard.
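The cycle-time adjustment noted above can be illustrated with simple arithmetic: at a constant incorporation rate, the extension interval scales as template length divided by rate. The following Python sketch is illustrative only. The two rates are hypothetical placeholders, not values measured in or taken from this disclosure, and the 7.2 kb template length is used merely as an example of a long template.

```python
def extension_seconds(template_nt: int, rate_nt_per_s: float) -> float:
    """Time to copy `template_nt` nucleotides at a constant incorporation rate."""
    if template_nt < 0 or rate_nt_per_s <= 0:
        raise ValueError("template length must be >= 0 and rate must be > 0")
    return template_nt / rate_nt_per_s


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
REPAIR_TYPE_RATE = 50.0      # nt/s, assumed for a repair-type polymerase
REPLICASE_TYPE_RATE = 500.0  # nt/s, assumed for a pol III-type enzyme
TEMPLATE_NT = 7200           # 7.2 kb example template

slow = extension_seconds(TEMPLATE_NT, REPAIR_TYPE_RATE)
fast = extension_seconds(TEMPLATE_NT, REPLICASE_TYPE_RATE)
print(f"repair-type: {slow:.1f} s, replicase-type: {fast:.1f} s")
```

Under these assumed rates the extension step would shorten tenfold; the actual interval for a given enzyme preparation would need to be determined empirically.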

[0246] In an alternative preferred embodiment, the invention provides methods of extending primers for several kilobases, a reaction that is central to amplifying large nucleic acid molecules, by a technique commonly referred to as “long chain PCR” (Barnes, 1994; Cheng, 1994).

[0247] In such a method the target primed DNA can contain a single strand stretch of DNA to be copied into the double strand form of several or tens of kilobases. The reaction is performed in a suitable buffer, preferably Tris, at a pH of between 5.5-9.5, preferably 7.5. The reaction also contains MgCl₂ in the range 1 mM to 10 mM, preferably 8 mM, and may contain a suitable salt such as NaCl, KCl or sodium or potassium acetate. The reaction also contains ATP in the range of 20 μM to 1 mM, preferably 0.5 mM, that is needed for the clamp loader to assemble the clamp onto the primed template, and a sufficient concentration of deoxynucleoside triphosphates in the range of 50 μM to 0.5 mM, preferably 60 μM, for chain extension. The reaction contains a sliding clamp, such as the β subunit, in the range of 20 ng to 200 ng, preferably 100 ng, for action as a clamp to stimulate the DNA polymerase. The chain extension reaction contains a DNA polymerase and a clamp loader, which could be added either separately or as a single Pol III*-like particle, preferably as a Pol III*-like particle that contains the DNA polymerase and clamp loading activities. The Pol III-type enzyme is added preferably at a concentration of about 0.0002-200 units per milliliter, about 0.002-100 units per milliliter, about 0.2-50 units per milliliter, and most preferably about 2-50 units per milliliter. The reaction is incubated at elevated temperature, preferably 60° C. or more, and could include other proteins to enhance activity such as a single strand DNA binding protein.
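By way of non-limiting illustration, the component ranges recited for the long chain extension reaction can be collected into a small lookup table and used to check a proposed reaction recipe. The Python sketch below is an assumption-laden convenience, not part of the disclosed method: the key names, the units encoded in those names, and the treatment of the preferred 60° C. minimum as a hard threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float
    preferred: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Ranges and preferred values as recited for the long chain extension reaction.
# Key names (and the units embedded in them) are hypothetical labels.
LONG_CHAIN_RANGES = {
    "pH": Range(5.5, 9.5, 7.5),
    "MgCl2_mM": Range(1.0, 10.0, 8.0),
    "ATP_mM": Range(0.02, 1.0, 0.5),         # 20 uM to 1 mM
    "dNTP_uM": Range(50.0, 500.0, 60.0),     # 50 uM to 0.5 mM
    "beta_clamp_ng": Range(20.0, 200.0, 100.0),
}
MIN_TEMPERATURE_C = 60.0  # "preferably 60 C or more", treated here as a floor


def out_of_range(recipe: dict, temperature_c: float) -> list:
    """Return the names of components that are missing or outside the recited range."""
    problems = [
        name
        for name, rng in LONG_CHAIN_RANGES.items()
        if name not in recipe or not rng.contains(recipe[name])
    ]
    if temperature_c < MIN_TEMPERATURE_C:
        problems.append("temperature_C")
    return problems


# A recipe built from the preferred values passes the check.
preferred_recipe = {name: rng.preferred for name, rng in LONG_CHAIN_RANGES.items()}
print(out_of_range(preferred_recipe, 65.0))
```

A recipe with, for example, 12 mM MgCl₂ would be reported under the hypothetical key "MgCl2_mM".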

[0248] In another preferred embodiment, the invention provides methods of extending primers on linear templates in the absence of the clamp loader. In this reaction, the primers are annealed to the linear DNA, preferably at the ends such as in standard PCR applications. The reaction is performed in a suitable buffer, preferably Tris, at a pH of between 5.5-9.5, preferably 7.5. The reaction also contains MgCl₂ in the range of 1 mM to 10 mM, preferably 8 mM, and may contain a suitable salt such as NaCl, KCl or sodium or potassium acetate. The reaction also contains a sufficient concentration of deoxynucleoside triphosphates in the range of 50 μM to 0.5 mM, preferably 60 μM, for chain extension. The reaction contains a sliding clamp, such as the β subunit, in the range of 20 ng to 20 μg, preferably about 2 μg, for ability to slide on the end of the DNA and associate with the polymerase for action as a clamp to stimulate the DNA polymerase. The chain extension reaction also contains a Pol III-type polymerase subunit such as α, core, or a Pol III*-like particle. The Pol III-type enzyme is added preferably at a concentration of about 0.0002-200 units per milliliter, about 0.002-100 units per milliliter, about 0.2-50 units per milliliter, and most preferably about 2-50 units per milliliter. The reaction is incubated at elevated temperature, preferably 60° C. or more, and could include other proteins to enhance activity such as a single strand DNA binding protein.

[0249] The methods of the present invention thus will provide high-fidelity amplified copies of a nucleic acid molecule in a more rapid fashion than traditional amplification methods using the repair-type enzymes.

[0250] These amplified nucleic acid molecules may then be manipulated according to standard recombinant DNA techniques. For example, a nucleic acid molecule amplified according to the present methods may be inserted into a vector, which is preferably an expression vector, to produce a recombinant vector comprising the amplified nucleic acid molecule. This vector may then be inserted into a host cell, where it may, for example, direct the host cell to produce a recombinant polypeptide encoded by the amplified nucleic acid molecule. Methods for inserting nucleic acid molecules into vectors, and inserting these vectors into host cells, are well-known to one of ordinary skill in the art (see, e.g., Maniatis, 1992).

[0251] Alternatively, the amplified nucleic acid molecule may be directly inserted into a host cell, where it may be incorporated into the host cell genome or may exist as an extrachromosomal nucleic acid molecule, thereby producing a recombinant host cell. Methods for introduction of a nucleic acid molecule into a host cell, including calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection or other methods, are described in many standard laboratory manuals (see, e.g., Davis, 1986).

[0252] For each of the above techniques wherein an amplified nucleic acid molecule is introduced into a host cell via a vector or via direct introduction, preferred host cells include but are not limited to a bacterial cell, a yeast cell, or an animal cell. Bacterial host cells preferred in the present invention are E. coli, Bacillus spp., Streptomyces spp., Erwinia spp., Klebsiella spp. and Salmonella typhimurium. Preferred as a host cell is E. coli, and particularly preferred are E. coli strains DH10B and Stbl2, which are available commercially (Life Technologies, Inc., Gaithersburg, Md.). Preferred animal host cells are insect cells, nematode cells and mammalian cells. Insect host cells preferred in the present invention are Drosophila spp. cells, Spodoptera Sf9 and Sf21 cells, and Trichoplusia High-Five cells, each of which is available commercially (e.g., from Invitrogen; San Diego, Calif.). Preferred nematode host cells are those derived from C. elegans, and preferred mammalian host cells are those derived from rodents, particularly rats, mice or hamsters, and primates, particularly monkeys and humans. Particularly preferred as mammalian host cells are CHO cells, COS cells and VERO cells.

[0253] By the present invention, nucleic acid molecules may be sequenced according to any of the literature-described manual or automated sequencing methods. Such methods include, but are not limited to, dideoxy sequencing methods such as “Sanger sequencing” (Sanger and Coulson, 1975; Sanger et al., 1977; U.S. Pat. No. 4,962,022 to Fleming et al.; and U.S. Pat. No. 5,498,523 to Tabor et al.), as well as more complex PCR-based nucleic acid fingerprinting techniques such as Random Amplified Polymorphic DNA (RAPD) analysis (Williams et al., 1990), Arbitrarily Primed PCR (AP-PCR) (Welsh and McClelland, 1990), DNA Amplification Fingerprinting (DAF) (Caetano-Anollés, 1991), microsatellite PCR or Directed Amplification of Minisatellite-region DNA (DAMD) (Heath et al., 1993), and Amplification Fragment Length Polymorphism (AFLP) analysis (EP 534,858 to Vos et al.; Vos et al., 1995; Lin and Kuo, 1995).

[0254] As described above for amplification methods, the nucleic acid molecule to be sequenced by these methods is typically contacted with a composition comprising a type A or type B DNA polymerase. By contrast, in sequencing a nucleic acid molecule according to the methods of the present invention, the nucleic acid molecule is contacted with a composition comprising a thermostable DNA pol III-type enzyme complex instead of necessarily using a DNA polymerase of the family A or B classes. As for amplification methods, the DNA pol III-type complexes used in the nucleic acid sequencing methods of the present invention are preferably substantially reduced in 3′-5′ exonuclease activity; most preferable for use in the present methods is a DNA polymerase III-type complex which lacks the ε subunit. DNA pol III-type complexes used for nucleic acid sequencing according to the present methods are used at the same preferred concentration ranges described above for long chain extension of primers.

[0255] Once the nucleic acid molecule to be sequenced is contacted with the DNA pol III complex, the sequencing reactions may proceed according to the protocols disclosed in the above-referenced techniques.

[0256] As discussed above, the invention extends to kits for use in nucleic acid amplification or sequencing utilizing DNA polymerase III-type enzymes according to the present methods. A DNA amplification kit according to the present invention may comprise a carrier means having in close confinement therein two or more container means, such as vials, tubes, bottles and the like. A first such container means may contain a DNA polymerase III-type enzyme complex, and a second such container means may contain a deoxynucleoside triphosphate. The amplification kit encompassed by this aspect of the present invention may further comprise additional reagents and compounds necessary for carrying out standard nucleic acid amplification protocols (See U.S. Pat. No. 4,683,195 to Mullis et al. and U.S. Pat. No. 4,683,202 to Mullis, which are directed to methods of DNA amplification by PCR).

[0257] Similarly, a DNA sequencing kit according to the present invention comprises a carrier means having in close confinement therein two or more container means, such as vials, tubes, bottles and the like. A first such container means may contain a DNA polymerase III-type enzyme complex, and a second such container means may contain a dideoxynucleoside triphosphate. The sequencing kit may further comprise additional reagents and compounds necessary for carrying out standard nucleic acid sequencing protocols, such as pyrophosphatase, agarose or polyacrylamide media for formulating sequencing gels, and other components necessary for detection of sequenced nucleic acids (See U.S. Pat. No. 4,962,020 to Fleming et al. and U.S. Pat. No. 5,498,523 to Tabor et al., which are directed to methods of DNA sequencing).

[0258] The DNA polymerase III-type complex contained in the first container means of the amplification and sequencing kits provided by the invention is preferably a thermostable DNA polymerase III-type enzyme complex and more preferably a DNA polymerase III-type enzyme complex that is reduced in 3′-5′ exonuclease activity. Naturally, the foregoing methods and kits are presented as illustrative and not restrictive of the use and application of the enzymes of the invention for DNA molecule amplification and sequencing. Likewise, the applications of specific embodiments of the enzymes, including conserved variants and active fragments thereof, are considered to be disclosed and included within the scope of the invention.

[0259] As discussed earlier, individual subunits could be modified to customize enzyme construction and corresponding use and activity. For example, the region of α that interacts with β could be subcloned onto another DNA polymerase, thereby causing β to enhance the activity of the recombinant polymerase. Alternatively, the β clamp could be modified to function with another protein or enzyme, thereby enhancing its activity or acting to localize its action to a particular targeted DNA. Finally, the polymerase active site could be modified to enhance its action, for example by changing Tyrosine, enabling more equal site stoppage with the four ddNTPs (Tabor et al., 1995). This represents a particular non-limiting illustration of the scope and practice of the present invention with reference to the utility of individual subunits hereof.

[0260] Accordingly and as stated above, the present invention also relates to a recombinant DNA molecule or cloned gene, or a degenerate variant thereof, which encodes any one or all of the subunits of the DNA Polymerase III-type enzymes of the present invention, or active fragments thereof. In the instance of the τ subunit, a predicted molecular weight of about 58 kD and an amino acid sequence set forth in SEQ ID Nos. 4 or 5 is comprehended; preferably a nucleic acid molecule, in particular a recombinant DNA molecule or cloned gene, encoding the 58 kD subunit of the Polymerase III of the invention, that has a nucleotide sequence or is complementary to a DNA sequence shown in FIGS. 4A and 4B (SEQ ID No. 1), and the coding region for dnaX set forth in FIG. 4C (SEQ ID No. 3). The γ subunit is smaller, and is approximately 50 kD, depending upon the extent of the frameshift that occurs. More particularly, and as set forth in FIG. 4E (SEQ ID No. 4), the γ subunit defined by a −1 frameshift possesses a molecular weight of 50.8 kD, while the γ subunit defined by a −2 frameshift, set forth in FIG. 4F (SEQ ID No. 5), possesses a molecular weight of 49.8 kD.

[0261] As discussed above, the invention also extends to the genes including holA, holB, dnaX, dnaQ, dnaE, and dnaN from thermophilic eubacteria (i.e., T.th. and A.ae) that have been isolated and/or purified, to corresponding vectors for the genes, and particularly, to the vectors disclosed herein, and to host cells including such vectors. In this connection, probes have been prepared which hybridize to the DNA polymerase III-type enzymes of the present invention, and which are selected from the various oligonucleotide probes or primers set forth in the present application. These include, without limitation, the oligonucleotide defined in SEQ ID No. 6, the oligonucleotide defined in SEQ ID No. 8, the oligonucleotide defined in SEQ ID No. 10, the oligonucleotide defined in SEQ ID No. 11, the oligonucleotide defined in SEQ ID No. 12, the oligonucleotide defined in SEQ ID No. 13, the oligonucleotide defined in SEQ ID No. 14, the oligonucleotide defined in SEQ ID No. 15, and the oligonucleotide defined in SEQ ID No. 16.

[0262] The methods of the invention include a method for producing a recombinant thermostable DNA polymerase III-type enzyme from a thermophilic bacterium, such as T.th., A.ae, T.ma., or B.st., which comprises culturing a host cell transformed with a vector of the invention under conditions suitable for the expression of the present DNA polymerase III. Another method includes a method for isolating a target DNA fragment consisting essentially of a DNA coding for a thermostable DNA polymerase III-type enzyme from a thermophilic bacterium comprising the steps of:

[0263] (a) forming a genomic library from the bacterium;

[0264] (b) transforming or transfecting an appropriate host cell with the library of step (a);

[0265] (c) contacting DNA from the transformed or transfected host cell with a DNA probe which hybridizes to a DNA fragment selected from the group consisting of the DNA fragments defined in SEQ ID No. 6 and the DNA fragments defined in SEQ ID No. 8 or the oligonucleotides set forth above; wherein hybridization is conducted under the following conditions:

[0266] i) hybridization: 1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS at 65° C. for 12 hours; and

[0267] ii) wash: 5×20 minutes with wash buffer consisting of 0.5% BSA (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), and 5% SDS;

[0268] (d) assaying the transformed or transfected cell of step (c) which hybridizes to the DNA probe for DNA polymerase III-type activity; and

[0269] (e) isolating a target DNA fragment which codes for the thermostable DNA polymerase III-type enzyme.

[0270] Also, antibodies including both polyclonal and monoclonal antibodies, and the DNA Polymerase III-like enzyme complex and/or their γ and τ subunits, α subunit(s), δ subunit, β subunit, and ε subunit may be used in the preparation of the enzymes of the present invention as well as other enzymes of similar thermophilic origin. For example, the DNA Polymerase III-type complex or its subunits may be used to produce both polyclonal and monoclonal antibodies to themselves in a variety of cellular media, by known techniques such as the hybridoma technique utilizing, for example, fused mouse spleen lymphocytes and myeloma cells.

[0271] The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal, antibody-producing cell lines can also be created by techniques other than fusion, such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., Schreier et al., 1980; Hammerling et al., 1981; Kennett et al., 1980; see also U.S. Pat. No. 4,341,761 to Ganfield et al.; U.S. Pat. No. 4,399,121 to Albarella et al.; U.S. Pat. No. 4,427,783 to Newman et al.; U.S. Pat. No. 4,444,887 to Hoffman; U.S. Pat. No. 4,451,570 to Royston et al.; U.S. Pat. No. 4,466,917 to Nussenzweig et al.; U.S. Pat. No. 4,472,500 to Milstein et al.; U.S. Pat. No. 4,491,632 to Wands et al.; and U.S. Pat. No. 4,493,890 to Morris.

[0272] Methods for producing polyclonal anti-polypeptide antibodies arewell-known in the art. See U.S. Pat. No. 4,493,795 to Nestor et al. Amonoclonal antibody, typically containing Fab and/or F(ab′)₂ portions ofuseful antibody molecules, can be prepared using the hybridomatechnology described in Antibodies—A Laboratory Manual, Harlow and Lane,eds., Cold Spring Harbor Laboratory, New York (1988), which isincorporated herein by reference. Briefly, to form the hybridoma fromwhich the monoclonal antibody composition is produced, a myeloma orother self-perpetuating cell line is fused with lymphocytes obtainedfrom the spleen of a mammal hyperimmunized with an elastin-bindingportion thereof.

[0273] A monoclonal antibody useful in practicing the present invention can be produced by initiating a monoclonal hybridoma culture comprising a nutrient medium containing a hybridoma that secretes antibody molecules of the appropriate antigen specificity. The culture is maintained under conditions and for a time period sufficient for the hybridoma to secrete the antibody molecules into the medium. The antibody-containing medium is then collected. The antibody molecules can then be further isolated by well-known techniques.

[0274] Media useful for the preparation of these compositions are both well-known in the art and commercially available and include synthetic culture media, inbred mice and the like. An exemplary synthetic medium is Dulbecco's minimal essential medium (DMEM) (Dulbecco et al., 1959) supplemented with 4.5 gm/l glucose, 20 mM glutamine, and 20% fetal calf serum. An exemplary inbred mouse strain is the Balb/c.

[0275] Another feature of this invention is the expression of the DNA sequences disclosed herein. As is well known in the art, DNA sequences may be expressed by operatively linking them to an expression control sequence in an appropriate expression vector and employing that expression vector to transform an appropriate unicellular host.

[0276] Such operative linking of a DNA sequence of this invention to an expression control sequence, of course, includes, if not already part of the DNA sequence, the provision of an initiation codon, ATG, in the correct reading frame upstream of the DNA sequence.

[0277] A wide variety of host/expression vector combinations may be employed in expressing the DNA sequences of this invention. Useful expression vectors, for example, may consist of segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable vectors include derivatives of SV40 and known bacterial plasmids, e.g., E. coli plasmids ColE1, pCR1, pBR322, pMB9 and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or other expression control sequences; and the like.

[0278] Any of a wide variety of expression control sequences (sequences that control the expression of a DNA sequence operatively linked to them) may be used in these vectors to express the DNA sequences of this invention. Such useful expression control sequences include, for example, the early or late promoters of SV40, CMV, vaccinia, polyoma or adenovirus, the lac system, the trp system, the TAC system, the TRC system, the LTR system, the major operator and promoter regions of phage λ, the control regions of fd coat protein, the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid phosphatase (e.g., Pho5), the promoters of the yeast α-mating factors, and other sequences known to control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various combinations thereof.

[0279] A wide variety of unicellular host cells are also useful in expressing the DNA sequences of this invention. These hosts may include well known eukaryotic and prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus, Streptomyces, fungi such as yeasts, and animal cells, such as CHO, R1.1, B-W and L-M cells, African Green Monkey kidney cells (e.g., COS 1, COS 7, BSC1, BSC40, and BMT10), insect cells (e.g., Sf9), and human cells and plant cells in tissue culture.

[0280] It will be understood that not all vectors, expression control sequences and hosts will function equally well to express the DNA sequences of this invention. Neither will all hosts function equally well with the same expression system. However, one skilled in the art will be able to select the proper vectors, expression control sequences, and hosts without undue experimentation to accomplish the desired expression without departing from the scope of this invention. For example, in selecting a vector, the host must be considered because the vector must function in it. The vector's copy number, the ability to control that copy number, and the expression of any other proteins encoded by the vector, such as antibiotic markers, will also be considered.

[0281] In selecting an expression control sequence, a variety of factors will normally be considered. These include, for example, the relative strength of the system, its controllability, and its compatibility with the particular DNA sequence or gene to be expressed, particularly with regard to potential secondary structures. Suitable unicellular hosts will be selected by consideration of, e.g., their compatibility with the chosen vector, their secretion characteristics, their ability to fold proteins correctly, and their fermentation requirements, as well as the toxicity to the host of the product encoded by the DNA sequences to be expressed, and the ease of purification of the expression products.

[0282] Considering these and other factors, a person skilled in the art will be able to construct a variety of vector/expression control sequence/host combinations that will express the DNA sequences of this invention on fermentation or in large scale animal culture.

[0283] It is further intended that analogs may be prepared from nucleotide sequences of the protein complex/subunit derived within the scope of the present invention. Analogs, such as fragments, may be produced, for example, by pepsin digestion of bacterial material. Other analogs, such as muteins, can be produced by standard site-directed mutagenesis of dnaX, dnaE, dnaQ, dnaN, holA, or holB coding sequences. Especially useful may be a mutation in dnaE that provides the polymerase with the ability to incorporate all four ddNTPs with equal efficiency, thereby producing an even banding pattern in sequencing gels, as discussed above and with reference to Tabor et al., 1995.

[0284] As mentioned above, a DNA sequence corresponding to dnaX, dnaQ, holA, holB, dnaE, or dnaN, or encoding the subunits of the DNA Polymerase III of the invention, can be prepared synthetically rather than cloned. The DNA sequence can be designed with the appropriate codons for the amino acid sequence of the subunit(s) of interest. In general, one will select preferred codons for the intended host if the sequence will be used for expression. The complete coding sequence is assembled from overlapping oligonucleotides prepared by standard methods (Edge, 1981; Nambiar et al., 1984; Jay et al., 1984).
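The selection of preferred codons for an intended host can be sketched as a simple back-translation of an amino acid sequence. The Python example below is a minimal illustration under stated assumptions: the one-codon-per-residue table is an illustrative set of codons commonly favored in E. coli and should be replaced by a codon usage table for the actual host, the peptide "MKLT" is a made-up input rather than any sequence of the invention, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative single "preferred" codon per residue (assumed for E. coli).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, stop_codon: str = "TAA") -> str:
    """Design a coding sequence for `protein` using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein) + stop_codon


def translate(dna: str) -> str:
    """Translate `dna` in frame 0 until the first stop codon."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


designed = back_translate("MKLT")  # hypothetical peptide
print(designed, translate(designed))
```

Translating the designed sequence back recovers the input peptide, which provides a quick consistency check before the sequence is divided into overlapping oligonucleotides for assembly.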

[0285] Synthetic DNA sequences allow convenient construction of genes which will express DNA Polymerase III analogs or “muteins”. Alternatively, DNA encoding muteins can be made by site-directed mutagenesis of native dnaX, dnaQ, holA, holB, dnaE or dnaN genes or their corresponding cDNAs, and muteins can be made directly using conventional polypeptide synthesis.

[0286] A general method for site-specific incorporation of unnatural amino acids into proteins is described in Noren et al., 1989. This method may be used to create analogs with unnatural amino acids.

GENERAL DESCRIPTION OF THE INVENTION

[0287] As discussed above, the present invention has as one of its characterizing features that a Polymerase III-type enzyme, as defined hereinabove, has been discovered in a thermophile and has the structure and function of a chromosomal replicase. This structure and function confer significant benefit when the enzyme is employed in procedures such as PCR where speed and accuracy of DNA reconstruction are crucial.

[0288] Chromosomal replicases are composed of several subunits in all organisms (Kornberg and Baker, 1992). In keeping with the need to replicate long chromosomes, replicases are rapid and highly processive multiprotein machines. All cellular replicases examined to date derive their processivity from one subunit that is shaped like a ring and completely encircles DNA (Kuriyan and O'Donnell, 1993; Kelman and O'Donnell, 1994). This “sliding clamp” subunit acts as a mobile tether for the polymerase machine (Stukenberg et al., 1991). The sliding clamp does not assemble onto the DNA by itself, but requires a complex of several proteins, called a “clamp loader”, which couples ATP hydrolysis to the assembly of sliding clamps onto DNA (O'Donnell et al., 1992). Hence, Pol III-type cellular replicases are comprised of three components: a clamp, a clamp loader, and the DNA polymerase.

[0289] An overall goal is to identify and isolate all of the genes encoding the replicase subunits from a thermophile for expression and purification in large quantity. Following this, the replication apparatus can be reassembled from individual subunit components for use in kits, PCR, sequencing and diagnostic applications (Onrust et al., 1995).

[0290] As a beginning to identify and characterize the replicase of a thermophile, we started by looking for a homologue to the prokaryotic dnaX gene, which encodes subunits (γ and τ) of the replicase. The dnaX gene has another homologue, holB, which encodes yet another subunit (δ′) of the replicase. The amino acid sequences of the δ′ subunit (encoded by holB) and the τ/γ subunits (encoded by dnaX) are particularly highly conserved in evolution from prokaryotes to eukaryotes (Chen et al., 1992; O'Donnell et al., 1993; Onrust et al., 1993; Carter et al., 1993; Cullman et al., 1995).

[0291] One organism chosen for study and exposition herein is the exemplary extreme thermophile Thermus thermophilus (T.th.). It is understood that other members of the class such as the eubacterium Thermotoga are expected to be analogous in both structure and function. Thus, the investigation of T.th. proceeded and initially, a T.th. homologue of dnaX was identified. The gene encodes a full length protein of 529 amino acids. The amino terminal third of the sequence shares over 50% homology to dnaX genes as divergent as E. coli (gram negative) and B. subtilis (gram positive). The T.th. dnaX gene contains a DNA sequence that provides a translational frameshift signal for production of two proteins from the same gene. Such frameshifting has been documented only in the case of E. coli (Tsuchihashi and Kornberg, 1990; Flower and McHenry, 1990; Blinkowa and Walker, 1990). No frameshifting has been documented to occur in the dnaX homologues (RFC subunit genes) of yeast and humans (Eukaryotic kingdom).

[0292] The presence of a dnaX gene that produces two subunits implies that T.th. has a clamp loader (γ) and may be organized by τ into a PolIII*-type replicase like the replicative DNA polymerase of Escherichia coli, DNA polymerase III holoenzyme. The E. coli DNA polymerase III holoenzyme contains 10 different subunits, some in copies of two or more for a total composition of 18 polypeptide chains (Kornberg and Baker, 1992; Onrust et al., 1995). The holoenzyme is composed of three major activities: the 3-subunit DNA polymerase core (αεθ), the β subunit DNA sliding clamp, and the 5-subunit γ complex clamp loader (γδδ′χψ). This 3 component strategy generalizes to eukaryotes, which utilize a clamp (PCNA) and a 5-subunit clamp loader (RFC) which provide processivity to DNA polymerase δ (reviewed in Kelman and O'Donnell, 1994).

[0293] In E. coli, the polymerase and clamp loader components are organized into one PolIII* particle by the τ subunit, which acts as a “glue” protein (Onrust et al., 1995). One dimer of τ holds together two core polymerases in the particle, which are utilized for the coordinated and simultaneous replication of both strands of duplex DNA (McHenry, 1982; Maki et al., 1988; Yuzhakov et al., 1996). The “glue” protein τ subunit also binds one clamp loader (called γ complex), thereby acting as a scaffold for a large superstructure assembly called DNA polymerase III*. The gene encoding τ, called dnaX, also encodes the γ subunit of DNA polymerase III. The β subunit then associates with Pol III* to form the DNA polymerase III holoenzyme. The γ subunit is approximately ⅔ the length of τ. γ shares the N-terminus of τ, but is truncated by a translational frameshifting mechanism that, after the shift, encounters a stop codon within two amino acids (Tsuchihashi and Kornberg, 1990; Flower and McHenry, 1990; Blinkowa and Walker, 1990). Hence, γ is the N-terminal 453 amino acids of τ, but contains one unique residue at the C-terminus (the penultimate codon encodes a Lys residue which is the same sequence as if the frameshift did not take place). This frameshift is highly efficient and occurs approximately 50% of the time.
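The way a −1 translational frameshift yields a truncated product that shares its N-terminus with the full-length protein can be sketched in a few lines of Python. This is a toy illustration only: the 24-nucleotide sequence is made up for the example and is not taken from any dnaX gene, the position of the shift is supplied by hand rather than predicted, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, start: int = 0) -> str:
    """Translate from `start` in steps of three until the first stop codon."""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


def translate_with_minus1_shift(dna: str, codons_before_shift: int) -> str:
    """Read `codons_before_shift` codons in frame 0, slip back one nucleotide,
    then continue in the -1 frame until a stop codon.
    Assumes there is no stop codon before the shift site."""
    split = 3 * codons_before_shift
    head = translate(dna[:split])
    tail = translate(dna, start=split - 1)
    return head + tail


TOY_GENE = "ATGGCAAAAAAGAGTGAACCGTAA"  # made-up sequence for illustration

full_length = translate(TOY_GENE)                    # unshifted product
truncated = translate_with_minus1_shift(TOY_GENE, 4)  # shifted product
print(full_length, truncated)
```

In this toy case the shifted product shares its first four residues with the unshifted product, gains one residue that differs from the zero-frame reading, and then terminates at a stop codon in the −1 frame, mirroring the relationship between γ and τ described above.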

[0294] The sequences of the γ and τ subunits encoded by the dnaX gene are homologous to the clamp loading subunits in all other organisms extending from gram negative bacteria through gram positive bacteria, the Archaea Kingdom and the Eukaryotic Kingdom from yeast to humans (O'Donnell et al., 1993). All of these organisms utilize a three component replicase (DNA polymerase, clamp and clamp loader) and in these cases the 3 components appear to behave as independent units in solution rather than forming a large holoenzyme superstructure. For example, in eukaryotes from yeast to humans, the clamp loader is the five subunit RFC, the clamp is PCNA, and the polymerases δ and ε are all stimulated by the PCNA clamp assembled onto primed DNA by RFC (reviewed in Kelman and O'Donnell, 1994).

[0295] The discovery of a dnaX gene in T.th. provided confidence that thermophilic bacteria would contain a three component Pol III-type enzyme. Hence, we proceeded to identify the dnaQ and dnaN genes encoding, respectively, the proofreading 3′-5′ exonuclease and the β DNA sliding clamp subunits of a Pol III-type enzyme. Following this, we purified a Pol III-type enzyme from extracts of T.th. cells. This enzyme preparation had the unique property of extending a single primer around a long 7.2 kb single strand DNA genome of M13mp18 bacteriophage. Such a primer extension assay serves as a tool to detect and identify the Pol III-type of enzyme in cell extracts. The enzyme was confirmed to be a Pol III-type enzyme based on its reactivity with antibody directed against the E. coli α subunit (the DNA polymerase subunit) and antibody directed against E. coli γ subunit. Proteins corresponding to α, τ, γ, δ and δ′ were easily visible and lend themselves to identification of the genes through use of peptide microsequencing followed by primer design for PCR amplification. For example, from this DNA Pol III-type preparation, the peptide sequence of the α subunit was obtained, which then allowed the dnaE gene encoding the α subunit (DNA polymerase) of the Pol III-type enzyme to be obtained.

[0296] These methods should be widely applicable to other thermophilic bacteria. Additional antibody reagents against other Pol III-type enzyme components, such as RFC subunits, DNA polymerase delta, epsilon or beta, and the PCNA clamp from known organisms can be made quite easily as polyclonal or monoclonal antibody preparations using as antigen either naturally purified sequence, recombinant sequence, or synthetic peptide sequence. Examples of known sequences of these Pol III-type enzymes are to be found in: DNA polymerases (Braithwaite and Ito, 1993), RFC clamp loaders (Cullman et al., 1995) and PCNA (Kelman and O'Donnell, 1995).

[0297] The remaining genes of T.th. Pol III needed for efficient extension of primed templates, holA and holB, are now identified. The holA coding sequence (SEQ. ID. No. 157) encodes the δ subunit (SEQ. ID. No. 158) and the holB coding sequence (SEQ. ID. No. 155) encodes the δ′ subunit (SEQ. ID. No. 156). The holA and holB coding sequences and the δ and δ′ subunits were identified via BLAST search (Altschul et al., 1997), and subsequently isolated following circular PCR. These genes will provide the subunit preparations through use of standard recombinant techniques and protein purification protocols. The protein subunits can then be used to reconstitute the enzyme complexes as they exist in the cell. This type of reconstitution of Pol III has been demonstrated using the protein subunits of DNA polymerase III holoenzyme from E. coli to assemble the entire particle. See, e.g., U.S. Pat. Nos. 5,583,026 and 5,668,004 to O'Donnell; and Onrust et al., 1995. The disclosures of these references are incorporated herein in their entireties.

[0298] Another organism chosen for study and exposition herein is the extreme thermophile Aquifex aeolicus. Thus, the present invention also relates to various isolated DNA molecules from Aquifex aeolicus, in particular the DNA molecules encoding various replication proteins. These include dnaE, dnaX, dnaN, holA, holB, ssb DNA molecules from A. aeolicus. These DNA molecules can be inserted into an expression system or used to transform host cells from which isolated proteins can be obtained. The isolated proteins encoded by these DNA molecules are also disclosed.

[0299] Unless otherwise indicated below, the Aquifex aeolicus sequences were obtained by sequence comparisons using the Thermus thermophilus counterparts as query against the genome of Aquifex aeolicus (Deckert et al., 1998).

[0300] The A. aeolicus dnaE gene has a nucleotide coding sequence according to SEQ. ID. No. 117 and encodes the α subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 118. The A.ae α subunit has approximately 41% aa identity to the T.th. α subunit.

[0301] The A. aeolicus dnaX gene has a nucleotide coding sequence according to SEQ. ID. No. 119 and encodes the τ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 120. The A.ae τ subunit has approximately 51% aa identity to the T.th. τ subunit.

[0302] The A. aeolicus dnaN gene has a nucleotide coding sequence according to SEQ. ID. No. 121 and encodes the β subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 122. The A.ae β subunit has approximately 27% aa identity to the T.th. β subunit.

[0303] The A. aeolicus dnaQ gene has a nucleotide coding sequence according to SEQ. ID. No. 127 and encodes the ε subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 128. The A.ae ε subunit has approximately 26% aa identity to the T.th. ε subunit.

[0304] The A. aeolicus ssb gene has a nucleotide coding sequence according to SEQ. ID. No. 129 and encodes the SSB protein, which has an amino acid sequence according to SEQ. ID. No. 130. The A.ae SSB protein has approximately 22% aa identity to the T.th. SSB protein.

[0305] Further, the coding sequences of A. aeolicus genes encoding the helicase (dnaB), helicase loader (dnaC), and primase (dnaG) are also disclosed. The A. aeolicus dnaB gene has a nucleotide coding sequence according to SEQ. ID. No. 131 and encodes the DnaB protein, which functions as a helicase and has an amino acid sequence according to SEQ. ID. No. 132. The A. aeolicus dnaG gene has a nucleotide coding sequence according to SEQ. ID. No. 133 and encodes the DnaG protein, which functions as a primase and has an amino acid sequence according to SEQ. ID. No. 134. The A. aeolicus dnaC gene has a nucleotide coding sequence according to SEQ. ID. No. 135 and encodes the DnaC protein, which functions as a helicase loader and has an amino acid sequence according to SEQ. ID. No. 136.

[0306] The A. aeolicus holA and holB genes were previously unidentified by Deckert et al., 1998. Using the Thermus thermophilus δ′ subunit amino acid sequence and the Thermotoga maritima δ subunit amino acid sequence (SEQ. ID. No. 146, which itself was obtained using the T.th. δ subunit amino acid sequence of SEQ. ID. No. 158) in separate BLAST searches (Altschul et al., 1997), corresponding polypeptide products in Aquifex aeolicus were identified. The A. aeolicus holA gene has a nucleotide coding sequence according to SEQ. ID. No. 123 and encodes the δ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 124. The A.ae δ subunit has approximately 21% aa identity to the T.m. δ subunit. The A. aeolicus holB gene has a nucleotide coding sequence according to SEQ. ID. No. 125 and encodes the δ′ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 126. The A.ae δ′ subunit has approximately 24% aa identity to the T.th. δ′ subunit.
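The approximate amino acid identities quoted for the A. aeolicus subunits can be illustrated with a minimal sketch. The text does not state how identity was calculated, so the following assumes two sequences that have already been aligned (gaps written as "-") and counts identical residues over the columns in which both sequences carry a residue. The function name and the toy fragments are hypothetical and are not taken from any SEQ. ID.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (an assumption; conventions differ)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Made-up aligned fragments used only to exercise the function.
toy_a = "MKT-LLVA"
toy_b = "MRTALL-A"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which would give different percentages for the same alignment.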

[0307] This invention also clones at least the coding regions of a set of A. aeolicus genes which encode proteins that assemble into an A. aeolicus DNA polymerase III replication enzyme. These genes (dnaE, dnaN, dnaX, dnaQ, holA, holB, ssb) were cloned into expression vectors, the proteins were expressed in E. coli, and the corresponding protein subunits were purified (alpha, beta, tau, delta, delta prime, SSB). This invention identifies the major protein-protein contacts among these subunits, shows how these proteins can be assembled into higher order multiprotein complexes, and how to form a rapid and processive DNA polymerase III holoenzyme.

[0308] In contrast to the E. coli and T. thermophilus dnaX genes, which encode both τ and γ subunits, the A. aeolicus dnaX gene produces only the full length τ subunit when expressed in E. coli. The A. aeolicus τ is intermediate in length between the γ and τ subunits of E. coli DNA polymerase III holoenzyme. The E. coli τ binds α; the γ subunit does not bind α. Due to the intermediate size of A. aeolicus τ, it was not known whether the A. aeolicus τ would bind the α subunit. This invention shows that indeed, the A. aeolicus τ binds to α, as well as δ and δ′, thereby forming an A. aeolicus ατδδ′ complex. Until the identification of the δ and δ′ subunits by the present invention, their existence, let alone their interaction with τ and α, was not even known.

[0309] The A. aeolicus ατδδ′/β Pol III can be applied in several useful DNA handling techniques. For example, the thermophilic Pol III will be useful in DNA sequencing, especially at high temperature. Also, use of a thermal resistant rapid and processive Pol III is an important improvement to polymerase chain reaction technology. The ability of the A. aeolicus Pol III to extend primers for multiple kilobases makes possible the amplification of very long segments of DNA (long chain PCR).

[0310] Another organism chosen for study and exposition herein is the extreme thermophile Thermotoga maritima. Thus, the present invention also relates to various isolated DNA molecules from Thermotoga maritima, in particular the DNA molecules encoding various replication proteins. These include dnaE, dnaX, dnaN, dnaQ, holA, holB, ssb DNA molecules from Thermotoga maritima. These DNA molecules can be inserted into an expression system or used to transform host cells from which isolated proteins can be obtained. The isolated proteins encoded by these DNA molecules are also disclosed.

[0311] Unless otherwise indicated below, the Thermotoga maritima sequences were obtained by sequence comparisons using the Thermus thermophilus counterparts as query against the genome of Thermotoga maritima (Nelson et al., 1999).

[0312] The T. maritima dnaE gene has a nucleotide coding sequence according to SEQ. ID. No. 137 and encodes the α subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 138. The T.m. α subunit has approximately 33% aa identity to the T.th. α subunit.

[0313] The T. maritima dnaQ gene has a nucleotide coding sequence according to SEQ. ID. No. 139 and encodes the ε subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 140. The T.m. ε subunit has approximately 34% aa identity to the T.th. ε subunit.

[0314] The T. maritima dnaX gene has a nucleotide coding sequence according to SEQ. ID. No. 141 and encodes the τ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 142. The T.m. τ subunit has approximately 48% aa identity to the T.th. τ subunit.

[0315] The T. maritima dnaN gene has a nucleotide coding sequence according to SEQ. ID. No. 143 and encodes the β subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 144. The T.m. β subunit has approximately 28% aa identity to the T.th. β subunit.

[0316] The T. maritima ssb gene has a nucleotide coding sequence according to SEQ. ID. No. 149 and encodes the SSB protein, which has an amino acid sequence according to SEQ. ID. No. 150. The T.m. SSB protein has approximately 18% aa identity to the T.th. SSB protein.

[0317] Further, the coding sequences of T. maritima genes encoding the helicase (dnaB) and primase (dnaG) are also disclosed. The T. maritima dnaB gene has a nucleotide coding sequence according to SEQ. ID. No. 151 and encodes the DnaB protein, which functions as a helicase and has an amino acid sequence according to SEQ. ID. No. 152. The T. maritima dnaG gene has a nucleotide coding sequence according to SEQ. ID. No. 153 and encodes the DnaG protein, which functions as a primase and has an amino acid sequence according to SEQ. ID. No. 154.

[0318] The T. maritima holA and holB genes were previously unidentified by Nelson et al., 1999. Using the Thermus thermophilus δ and δ′ subunit amino acid sequences (SEQ. ID. Nos. 158 and 156, respectively) in separate BLAST searches (Altschul et al., 1997), corresponding polypeptide products in T. maritima were identified. The T. maritima holA gene has a nucleotide coding sequence according to SEQ. ID. No. 145 and encodes the δ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 146. The T.m. δ subunit has approximately 37% aa identity to the T.th. δ subunit. The T.m. holB gene has a nucleotide coding sequence according to SEQ. ID. No. 147 and encodes the δ′ subunit, which has an amino acid sequence according to SEQ. ID. No. 148. The T.m. δ′ subunit has approximately 25% aa identity to the T.th. δ′ subunit.

[0319] Yet another organism chosen for study and exposition herein is the extreme thermophile Bacillus stearothermophilus. Thus, the present invention also relates to various isolated DNA molecules from Bacillus stearothermophilus, in particular the DNA molecules encoding various replication proteins. These include dnaE, dnaX, dnaN, dnaQ, holA, holB, ssb DNA molecules from Bacillus stearothermophilus. These DNA molecules can be inserted into an expression system or used to transform host cells from which isolated proteins can be obtained. The isolated proteins encoded by these DNA molecules are also disclosed.

[0320] Unless otherwise indicated below, the Bacillus stearothermophilus sequences were obtained by searching the database of this organism (at http://www.genome.ou.edu).

[0321] The B. stearothermophilus polC gene has a nucleotide coding sequence according to SEQ. ID. No. 183 and encodes the PolC or α-large subunit of the DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 184. The B.st. PolC subunit, like the PolC subunits of other Gram positive organisms, contains both polymerase and 3′-5′ exonuclease activity. This subunit, therefore, is essentially a fusion of α and ε.

[0322] The B. stearothermophilus dnaX gene has a partial nucleotide coding sequence according to SEQ. ID. No. 181 and encodes the τ subunit of DNA Polymerase III, which has a partial amino acid sequence according to SEQ. ID. No. 182. The B.st. τ subunit has approximately 31% aa identity to the T.th. τ subunit.

[0323] The B. stearothermophilus dnaN gene has a partial nucleotide coding sequence according to SEQ. ID. No. 173 and encodes the β subunit of DNA Polymerase III, which has a partial amino acid sequence according to SEQ. ID. No. 174. The B.st. β subunit has approximately 21% aa identity to the T.th. β subunit.

[0324] The B. stearothermophilus ssb gene has a nucleotide coding sequence according to SEQ. ID. No. 175 and encodes the SSB protein, which has an amino acid sequence according to SEQ. ID. No. 176. The B.st. SSB protein has approximately 23% aa identity to the T.th. SSB protein.

[0325] The B. stearothermophilus holA gene has a nucleotide coding sequence according to SEQ. ID. No. 177 and encodes the δ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 178. The B.st. δ subunit has approximately 26% aa identity to the T.th. δ subunit.

[0326] The B. stearothermophilus holB gene has a nucleotide coding sequence according to SEQ. ID. No. 179 and encodes the δ′ subunit of DNA Polymerase III, which has an amino acid sequence according to SEQ. ID. No. 180. The B.st. δ′ subunit has approximately 25% aa identity to the T.th. δ′ subunit.

[0327] By conducting BLAST searches of unidentified genomic DNA from other thermophilic eubacteria, it is possible to identify coding regions which encode various functional subunits of other Pol III replicative machinery.

[0328] Although it is generally appreciated that proteins isolated from a thermophile should retain activity at high temperature, there is no guarantee that they will retain temperature resistance when isolated in pure form. This invention shows that the A. aeolicus Pol III, like the T. thermophilus Pol III, is resistant to high temperature. It is expected that the T. maritima and B. stearothermophilus Pol III enzymes will similarly be resistant to high temperature.

[0329] The following experiments illustrate the identification and characterization of the enzymes and constructs of the present invention. Accordingly, in Examples 1-8 below, the identification and expression of γ and τ are presented, as the first step in the elucidation of the Thermus thermophilus Polymerase III reflective of the present invention. Examples 9-12 which follow set forth the protocol for the purification of the remainder of the sub-units of the enzyme that represent the substantial entirety of the functional replicative machinery of the enzyme. Examples 18-30 demonstrate the preparation of isolated A. aeolicus sequences and Pol III subunits and their thermostable use.

EXAMPLE 1

[0330] Experimental Procedures

[0331] Materials

[0332] DNA modification enzymes were from New England Biolabs. Labelled nucleotides were from Amersham, and unlabeled nucleotides were from New England Biolabs. The Alter-1 vector was from Promega. pET plasmids and E. coli strains BL21(DE3) and BL21(DE3)pLysS were from Novagen. Oligonucleotides were from Operon. Buffer A is 20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 5 mM DTT, and 10% glycerol.

[0333] Genomic DNA

[0334] Thermus thermophilus (strain HB8) was obtained from the American Type Culture Collection. Genomic DNA was prepared from cells grown in 0.1 L of Thermus medium N697 (ATCC: 4 g yeast extract, 8.0 g polypeptone (BBL 11910), 2.0 g NaCl, 30.0 g agar, 1.0 L distilled water) at 75° C. overnight. Cells were collected by centrifugation at 4° C. and the cell pellet was resuspended in 25 ml of 100 mM Tris-HCl (pH 8.0), 0.05 M EDTA, 2 mg/ml lysozyme and incubated at room temperature for 10 min. Then 25 ml of 0.10 M EDTA (pH 8.0), 6% SDS was added and mixed, followed by 60 ml of phenol. The mixture was shaken for 40 min. followed by centrifugation at 10,000×g for 10 min. at room temperature. The upper phase (50 ml) was removed and mixed with 50 ml of phenol:chloroform (50:50 v/v) for 30 min. followed by centrifugation for 10 min. at room temperature. The upper phase was decanted and the DNA was precipitated upon addition of 1/10th volume 3 M sodium acetate (pH 6.5) and 1 volume ethanol. The precipitate was collected by centrifugation and washed twice with 2 ml of 80% ethanol, dried and resuspended in 1 ml T.E. buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA).

[0335] Cloning of dnaX

[0336] DNA oligonucleotides for amplification of T.th. genomic DNA were as follows. The upstream 32mer (5′-CGCAAGCTTCACGCSTACCTSTTCTCCGGSAC-3′, S indicating a mixture of G and C) (SEQ. ID. No. 6) consists of a HindIII site within the first 9 nucleotides (underlined) followed by codons (SEQ. ID. No. 29) encoding the following amino acid sequence (HAYLFSGT) (SEQ. ID. No. 7). The downstream 34mer (5′-CGCGAATTCGTGCTCSGGSGGCTCCTCSAGSGTC-3′) (SEQ. ID. No. 8) consists of an EcoRI site (underlined) followed by codons (SEQ. ID. No. 30) encoding the sequence KTLEEPPEH (SEQ. ID. No. 9) on the complementary strand. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture according to the manufacturer's instructions (10 μl ThermoPol Buffer, 0.5 mM each dNTP and 0.5 mM MgSO₄). Amplification was performed using the following cycling scheme: 5 cycles of: 30 sec. at 95.5° C., 30 sec. at 40° C., 2 min. at 72° C.; 5 cycles of: 30 sec. at 95.5° C., 30 sec. at 45° C., and 2 min. at 72° C.; and 30 cycles of: 30 sec. at 95.5° C., 30 sec. at 50° C., and 30 sec. at 72° C. Products were visualized in a 1.5% native agarose gel.
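Because each S in the primers denotes a mixture of G and C, each degenerate primer is in practice a pool of concrete oligonucleotides. The following minimal sketch, offered only as an illustration, expands the two primer sequences given above into their concrete variants and locates the HindIII (AAGCTT) and EcoRI (GAATTC) recognition sites. The helper names expand and site_position are hypothetical and are not part of the described procedure.

```python
from itertools import product

# Only the bases that occur in the two primers are mapped; S = G or C as stated in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "GC"}

UPSTREAM = "CGCAAGCTTCACGCSTACCTSTTCTCCGGSAC"      # SEQ. ID. No. 6
DOWNSTREAM = "CGCGAATTCGTGCTCSGGSGGCTCCTCSAGSGTC"  # SEQ. ID. No. 8


def expand(primer: str) -> list[str]:
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def site_position(primer: str, site: str) -> int:
    """0-based index of a recognition site within the primer, or -1 if absent."""
    return primer.upper().find(site.upper())


up_variants = expand(UPSTREAM)
down_variants = expand(DOWNSTREAM)
print(len(UPSTREAM), len(up_variants), site_position(UPSTREAM, "AAGCTT"))
print(len(DOWNSTREAM), len(down_variants), site_position(DOWNSTREAM, "GAATTC"))
```

Each S position doubles the number of variants, so the degeneracy of a primer is 2 raised to the number of S positions it contains.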

[0337] Genomic DNA was digested with either XhoI, XbaI, StuI, PstI, NcoI, MluI, KpnI, HindIII, EcoRI, EagI, BglI, or BamHI, followed by Southern analysis in a native agarose gel (Maniatis et al., 1982). Approximately 0.5 μg of digest was analyzed in each lane of a 0.8% native agarose gel followed by transfer to an MSI filter (Micron Separations Inc.). The transfer included the following steps:

[0338] 1. The agarose gel was soaked in 500 ml of 1% HCl with gentle shaking for 10 min.

[0339] 2. Then the gel was soaked in 500 ml of 0.5 M NaOH+1.5 M NaCl for 40 min.

[0340] 3. After that the gel was soaked in 500 ml of 1 M ammonium acetate for 1 h.

[0341] 4. The DNA was transferred to the MSI filter with the use of blotting paper for 4 h.

[0342] 5. The filter was kept at 80° C. for 15 min. in the oven.

[0343] 6. The pre-hybridization step was run in 10 ml of Hybridization solution (1% crystalline BSA (fraction V) (Sigma), 1 mM EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS) at 65° C. for 30 min.

[0344] 7. The probe, radiolabelled by the random priming method (see below), was added to the pre-hybridization solution and kept at 65° C. for 12 h.

[0345] 8. The filter was washed at low stringency with 200 ml of the wash buffer (0.5% BSA (fraction V), 1 mM Na2EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS) with gentle shaking for 20 min. This step was repeated 5 times, followed by exposure to X-ray film (XAR-5, Kodak).

[0346] As a probe, the PCR product was radiolabelled by random priming as follows.

[0347] 1. 14 μl of the mixture containing 0.2 μg of PCR product DNA, 1 μg of the pd(N6) (Promega) and 2.5 μl of the 10× Klenow reaction buffer (100 mM Tris-HCl (pH 7.5), 50 mM MgCl₂, 75 mM dithiothreitol) was boiled for 10 min. and then kept at 4° C.

[0348] 2. The reaction volume was increased up to 25 μl, containing in addition 33 μM of each dNTP, except dATP, 10 μCi [α-³²P]dATP (800 Ci/mM), and 2 units of Klenow enzyme. The reaction mixture was incubated 1.5 h.

[0349] 3. 2 mg of sonicated herring sperm DNA (GibcoBRL) was added to the reaction and the volume was increased to 2 ml using hybridization solution. The sample was then boiled for 10 min.

[0350] A genomic library of XbaI digested DNA was prepared upon treating 1 μg genomic T.th. DNA with 10 units of XbaI in 100 μl of NEBuffer N2 (50 mM NaCl, 10 mM Tris-HCl (pH 7.9), 10 mM MgCl2, 1 mM DTT) for 2 h at 37° C. The digested DNA was purified by phenol/chloroform extraction and ethanol precipitation. The Alter-1 vector (0.5 μg) (Promega) was digested with 1 unit of XbaI in NEBuffer N2 and then purified by phenol/chloroform extraction and ethanol precipitation. One microgram of genomic digest was incubated with 0.05 μg of digested Alter-1 and 20 U of T4 ligase in 30 μl of ligase buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl2, 10 mM DTT and 1 mM ATP) at 15° C. for 12 h. The ligation reaction was transformed into the DH5α strain of E. coli and transformants were plated on LB plates containing ampicillin and screened for the dnaX insert using the radiolabelled PCR probe as follows:

[0351] 1. The colonies tested were lifted onto MSI filters, approximately 100 colonies to each filter.

[0352] 2. The filters, removed from the LB/Tc plates, were placed side up on a sheet of Whatman 3 MM paper soaked with 0.5 M NaOH for 5 min.

[0353] 3. The filters were transferred to a sheet of paper soaked with 1 M Tris-HCl (pH 7.5) for 5 min.

[0354] 4. The filters were placed on a sheet of paper soaked in 0.5 M Tris-HCl (pH 7.5), 1.25 M NaCl for 5 min.

[0355] 5. After drying by air, the filters were heated in the oven at 80° C. for 15 min. and then were analyzed by Southern hybridization.

[0356] Plasmid DNA was prepared from 20 positive colonies; of these, 6 contained the expected 4 kb insert when digested with XbaI. Sequencing of the insert was performed by the Sanger method using the Vent polymerase sequencing kit according to the manufacturer's instructions (New England Biolabs).

[0357] Identification of the dnaX Gene

[0358] The dnaX genes of the gram negative E. coli and the gram positive B. subtilis share more than 50% identity in amino acid sequence within the N-terminal 180 residues containing the ATP-binding domain (FIG. 2). Two highly conserved regions (shown in bold in FIG. 2) were used to design oligonucleotide primers for application of the polymerase chain reaction to T.th. genomic DNA. The expected PCR product, including the restriction sites (i.e. before cutting), is 345 nucleotides. Use of these primers with genomic T.th. DNA resulted in a product of the expected size. The PCR product was then radiolabelled and used to probe genomic DNA in a Southern analysis (FIG. 3). Genomic DNA was digested with several different restriction endonucleases, electrophoresed in a native agarose gel and then probed with the PCR fragment. The Southern analysis showed an XbaI fragment of approximately 4 kb, more than sufficient length to encode the dnaX gene. Other restriction nucleases produced fragments that were significantly longer, or produced two or more fragments, indicating presence of a site within the coding sequence of dnaX.

[0359] To obtain full length dnaX, genomic DNA was digested with XbaI and ligated into XbaI digested Alter-1 vector. Ligated DNA was transformed into DH5 alpha cells, and colonies were screened with the labeled PCR probe. Plasmid DNA was prepared from 20 positive colonies and analyzed for the appropriate sized insert using XbaI. Six of the twenty clones contained the expected 4 kb XbaI fragment as an insert, the sequence of which is shown in FIGS. 4A and 4B.

[0360] The Frameshift Site

[0361] The dnaX gene of E. coli produces two proteins, the γ and τ subunits, by a −1 frameshift (Tsuchihashi and Kornberg, 1990; Flower and McHenry, 1990; Blinkowa and Walker, 1990). The full length product yields τ, and the frameshift results in addition of one amino acid before encountering a stop codon to produce γ. The −1 frameshift site in the E. coli dnaX gene contains the sequence A AAA AAG, which follows the X XXY YYZ rule found in retroviral genes (Jacks et al., 1988). This “slippery sequence” preserves the initial two residues of the tRNAs in the aminoacyl and peptidyl sites both before and after the frameshift. Mutagenesis of the E. coli dnaX frameshifting site has shown that the first three residues can be nucleotides other than A, but that the A's in the second set of three nucleotides are important to frameshifting (Tsuchihashi and Brown, 1992).
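The X XXY YYZ rule can be expressed as a short check. The sketch below is illustrative only: it tests whether a heptanucleotide fits the pattern and shows that, for a −1 shift, the peptidyl-site and aminoacyl-site codons keep their first two nucleotides (XXY becomes XXX, and YYZ becomes YYY). The function names are hypothetical, and the non-slippery test sequence is made up for contrast.

```python
def is_slippery(heptamer: str) -> bool:
    """True if a 7-nt sequence fits the X XXY YYZ pattern."""
    h = heptamer.upper().replace(" ", "")
    if len(h) != 7:
        raise ValueError("expected exactly seven nucleotides")
    return h[0] == h[1] == h[2] and h[3] == h[4] == h[5]


def codons_before_and_after(heptamer: str):
    """P-site and A-site codons in the original frame and after a -1 shift."""
    h = heptamer.upper().replace(" ", "")
    before = (h[1:4], h[4:7])  # XXY, YYZ
    after = (h[0:3], h[3:6])   # XXX, YYY
    return before, after


def first_two_preserved(heptamer: str) -> bool:
    """True if both codons keep their first two nucleotides after a -1 shift."""
    before, after = codons_before_and_after(heptamer)
    return all(b[:2] == a[:2] for b, a in zip(before, after))


print(is_slippery("A AAA AAG"), codons_before_and_after("A AAA AAG"))
print(is_slippery("GAGAAGA"), first_two_preserved("GAGAAGA"))
```

The check covers only the heptamer pattern; the downstream stem-loop and the upstream Shine-Dalgarno sequence that also influence frameshifting are not modeled.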

[0362] Immediately downstream of the stop codon is a potential stem-loop structure which enhances frameshifting, presumably by causing the ribosome to pause. Further, the AAG codon lacks a cognate tRNA in E. coli and thus the G residue may facilitate the pause, and has been shown to aid the vigorous frameshifting observed in the E. coli dnaX gene (Tsuchihashi and Brown, 1992). A fourth component of frameshifting in the E. coli dnaX gene is presence of an upstream Shine-Dalgarno sequence which is thought to pair with the 16S rRNA to increase the frequency of frameshifting still further (Larsen et al., 1994).

[0363] Examination of the T.th. dnaX sequence reveals a single site that fulfills the X XXY YYZ rule in which positions 4-7 are A residues. The site differs from that in E. coli in that all seven residues are A, and the heptanucleotide sequence is flanked by another A residue on each side (i.e. A9). Surprisingly, the stop codon immediately downstream of this site is in the −2 frame, although there is a stop codon in the −1 frame 28 nucleotides downstream of the −2 stop codon. Indeed, a −2 frameshift would fulfill the requirement that the first two nucleotides of each codon in the peptidyl and aminoacyl sites be conserved during either a −1 or a −2 frameshift. As with the case of E. coli dnaX, there are secondary structure stem-loop structures immediately downstream. Finally, there is a Shine-Dalgarno sequence immediately adjacent to the frameshift site, as well as another Shine-Dalgarno sequence 22 nucleotides upstream of the frameshift site.

[0364] Assuming the first stop codon is utilized (i.e. −2 frameshift), the predicted size of the γ subunit in T.th. is 454 amino acids for a mass of 49.8 kDa, over 2 kDa larger than the 431 residue γ subunit (47.5 kDa) of E. coli. This would result in 2 residues after the −2 frameshift (i.e. after the GluLysLys, the residues LysAla would be added), to be compared to the result of the −1 frameshift in E. coli, which also results in 2 residues (LysGlu). In the event that a −1 frameshift were utilized in the T.th. dnaX gene, then an additional 12 residues would be added following the frameshift for a molecular mass of 50.8 kDa (i.e. after the GluLysLys, the residues LysProAspProLysAlaProProGlyProThrSer would be added at aa 453-464 of SEQ. ID. No. 4). As explained later, this nucleotide sequence was found to promote both −1 and −2 frameshifting in E. coli (FIG. 8). But first, we examined T.th. cells by Western analysis for the presence of two subunits homologous to E. coli γ and τ.
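The predicted masses above can be sanity-checked with a common rule of thumb of roughly 110 Da per amino acid residue. This average is an assumption introduced here for illustration and is not the method used to derive the stated values, which presumably come from the actual sequences. The residue counts are those given or implied in the text: 454 for the −2 product, 464 for the −1 product, and 431 for E. coli γ.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not taken from the source


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# (label, residue count, mass in kDa as stated in the text)
cases = [
    ("T.th. gamma, -2 frameshift", 454, 49.8),
    ("T.th. gamma, -1 frameshift", 464, 50.8),
    ("E. coli gamma", 431, 47.5),
]

for label, n, stated in cases:
    estimate = approx_mass_kda(n)
    print(f"{label}: estimated {estimate:.1f} kDa, stated {stated} kDa")
```

The estimate ignores amino acid composition, so agreement to within a few tenths of a kDa is the most that can be expected from it.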

EXAMPLE 2

[0365] Frameshifting Analysis of the T.th. dnaX Gene

[0366] Frameshifting was analyzed by inserting the frameshift site into lacZ in the three different reading frames, followed by plating on X-gal and scoring for blue or white colony formation (Weiss et al., 1987). The frameshifting region within T.th. dnaX was subcloned into the EcoRI/BamHI sites of pUC19. These sites are within the polylinker inside of the β-galactosidase gene. Three constructs were produced such that the insert was either in frame with the downstream coding sequence of β-galactosidase, or out of frame (either −1 or −2). An additional three constructs were designed by mutating the frameshift sequence and then placing this insert into the three reading frames of the β-galactosidase gene. These six plasmids were constructed as described below.

[0367] The upstream primer for the shifty sequences was 5′-gcg cgg atc cgg agg gag aaa aaa aaa gcc tca gcc ca-3′ (SEQ. ID. No. 10). The BamHI site for cloning into pUC is underlined. Also, the stop codon, tga, has been mutated to tca (also underlined). The upstream primer for the mutant shifty sequence was: 5′-gcg cgg atc cgg agg gag aga aga aaa gcc tca gcc ca-3′ (SEQ. ID. No. 11). The mutant sequence contains two substitutions of a G for an A residue in the polyA stretch (underlined). Three downstream primers were utilized with each upstream primer to create two sets of three inserts in the 0 frame, −1 frame and −2 frame. The sequences of these primers, and the length of insert (after cutting with EcoRI and BamHI and inserting into pUC19), are as follows: 5′-gaa tta aat tcg cgc ttc ggg agg tgg g-3′ (0 frameshift, total 58 nucleotide insert) (SEQ. ID. No. 12); 5′-gcg cga att cgc gct tcg gga ggt ggg-3′ (−1 frame, 54 mer insert) (SEQ. ID. No. 13); and 5′-gcg cga att cgg gcg ctt cag gag gtg gg-3′ (−2 frame, 56 mer insert) (SEQ. ID. No. 14). The downstream primers have an EcoRI site (underlined); the EcoRI site of the 0 frame insert was blunt ended to produce the greater length insert (converting the EcoRI site to an aattaatt sequence). Also, the tcg sequence, which produces the tga stop codon (underlined), was mutated to tca in the −2 downstream primer so that readthrough would be allowed after the frameshift occurred.
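The relationship between the three insert lengths and the reading frame of the downstream β-galactosidase sequence can be sketched as follows. This is a simplified model, not a description of the actual constructs: it assumes the 58 nucleotide insert defines the 0 frame, as stated in the text, and labels the other constructs by how far their length falls short of that reference, modulo 3. The function names and the set-based description of which shifts a site supports are hypothetical.

```python
REFERENCE_INSERT_LEN = 58  # the 0-frame insert length given in the text


def relative_frame(insert_len: int, reference_len: int = REFERENCE_INSERT_LEN) -> int:
    """Frame of the downstream lacZ sequence relative to the 0-frame construct (0, -1 or -2)."""
    return -((reference_len - insert_len) % 3)


def colony_is_blue(construct_frame: int, supported_shifts: set) -> bool:
    """Blue if lacZ is read in frame, directly or via a supported frameshift."""
    return construct_frame == 0 or construct_frame in supported_shifts


wild_type_site = {-1, -2}  # shifts reported for the T.th. sequence
mutant_site = set()        # mutated site reported to prevent both shifts

for length in (58, 54, 56):
    frame = relative_frame(length)
    print(length, frame, colony_is_blue(frame, wild_type_site), colony_is_blue(frame, mutant_site))
```

Under this model the wild-type site gives blue colonies in all three constructs, while the mutated site gives blue colonies only in the 0 frame construct.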

[0368] In summary, a region surrounding the frameshift site and ending at least 5 nucleotides past the −1 frameshift stop codon was inserted into the β-galactosidase gene of pUC19 in the three different reading frames (stop codons were mutated to prevent stoppage following a frameshift). These three plasmids were introduced into E. coli and plated with X-gal. The results, in FIG. 8, show that blue colonies were observed after 24 h incubation with all three plasmids and therefore both −1 and −2 frameshifting had occurred.

[0369] To further these results, two G residues were introduced into the polyA tract, which should disrupt the ability of this sequence to direct frameshifts. The mutated slippery sequence was inserted into pUC19 followed by transformation into E. coli and plating on X-gal. The results showed that both −1 and −2 frameshifting were prevented, further supporting the fact that frameshifting requires the polyA tract as expected (FIG. 8).

EXAMPLE 3

[0370] Expression Vector for T.th. γ and τ

[0371] The dnaX gene was cloned into the pET16 expression vector in the steps shown in FIG. 9. First, the bulk of the gene was cloned into pET16 by removing the PmlI/XbaI fragment from pAlterdnaX, and placing it into SmaI/XbaI digested Puc19 to yield Puc19dnaXCterm. The N-terminal sequence of the dnaX gene was then reconstructed to position an NdeI site at the N-terminus. This was performed by amplifying the 5′ region encoding the N-terminal section of γ/τ using an upstream primer containing an NdeI site that hybridizes to the dnaX gene at the initiating gtg codon (i.e. to encode Met where the Met is created by the PCR primer, and the Val is the initiating gtg start codon of dnaX). The primer sequence for this 5′ end was: 5′-gtggtgcatatg gtg agc gcc ctc tac cgc c-3′ (SEQ. ID. No. 15) (where the NdeI site is underlined, and the coding sequence of dnaX follows). The downstream primer hybridizes past the PmlI site at nucleotide positions 987-1004 downstream of the initiating gtg (primer sequence: 5′-gtggtggtcgac cca gga ggg cca cct cca g-3′ (SEQ. ID. No. 16), where the initial 12 nucleotides contain a SalGI restriction site, followed by the sequence from the region downstream of the stop codon). The 1.1 kb nucleotide PCR product was digested with PmlI/NdeI and the PmlI/NdeI fragment was ligated into NdeI/PmlI digested Puc19dnaXCterm to form Puc19dnaX. The Puc19dnaX plasmid was then digested with NdeI and SalI and the 1.9 kb fragment containing the dnaX gene was purified using the Sephaglas BandPrep Kit (Pharmacia-LKB). pET16b was digested with NdeI and XhoI. Then the full length dnaX gene was ligated into the digested pET16b to form pET dnaX.

EXAMPLE 4

[0372] Expression of T.th. γ and τ

[0373] As discussed in the previous example, the dnaX gene was engineered into the T7 based IPTG inducible pET16 vector such that the initiation codon was placed precisely following the Met residue of the N-terminal leader sequence (FIG. 9). This should produce a protein containing the entire sequence of γ and τ, along with a 21 residue leader containing 10 contiguous His residues (tagged-τ=60.6 kDa; tagged-γ=52.4 kDa for −2 frameshift). The pET dnaX plasmid was introduced into BL21(DE3)pLysS cells harboring the gene encoding T7 RNA polymerase under control of the lac repressor. Log phase cells were induced with IPTG and analyzed before and after induction in an SDS polyacrylamide gel (FIG. 10, lanes 1 and 2). The result shows that upon induction, two new proteins are expressed with the approximate sizes expected of the T.th. γ and τ subunits (larger than E. coli γ, and smaller than E. coli τ). The two proteins are produced in nearly equal amounts, similar to the case of the E. coli γ and τ subunits. In Western analysis, antibodies against the E. coli γ and τ subunits cross-reacted with the induced proteins, further supporting their identity as T.th. γ and τ (data not shown, but repeated with the pure subunits shown in FIG. 10, lane 6).

EXAMPLE 5

[0374] Purification of T.th. γ and τ

[0375] The His-tagged T.th. γ and τ proteins were purified from 6 L of induced E. coli cells containing the pET dnaX plasmid. Cells were lysed, the lysate was clarified of cell debris by centrifugation, and the supernatant was applied to a HiTrap chelate affinity column. Elution of the chelate affinity column yielded approximately 35 mg of protein in which the two predominant bands migrated in a region consistent with the molecular weight predicted from the dnaX gene (FIG. 10, lane 3), and produced a positive signal by Western analysis using polyclonal antibody directed against the E. coli γ and τ subunits (lane 4). The γ and τ subunits are present in nearly equal amounts, consistent with the nearly equal expression of these proteins in E. coli cells harboring the pET dnaX plasmid.

[0376] The γ and τ subunits were further purified by gel filtration on a Superose 12 column (FIG. 10, lane 4; FIG. 11). Recovery of T.th. γ and τ subunits through gel filtration was 81%. The E. coli γ and τ subunits, when separated from one another, elute during gel filtration as tetramers. A mixture of E. coli γ/τ results in a mixed tetramer of γ2τ2 along with γ4 and τ4 tetramers (Onrust et al., 1995). The mixture of T.th. γ/τ elutes ahead of the 150 kDa marker, and thus is consistent with the expected mass of a γ2τ2 tetramer (225 kDa) and γ4 and τ4 tetramers.
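The tetramer masses referred to above follow from simple arithmetic on the tagged monomer masses quoted in Example 4 (52.4 kDa for γ from a −2 frameshift, 60.6 kDa for τ). The sketch below is illustrative only; the dictionary and function names are hypothetical.

```python
# Tagged monomer masses in kDa, as quoted in Example 4 (gamma from the -2 frameshift).
MONOMER_KDA = {"gamma": 52.4, "tau": 60.6}


def tetramer_mass(n_gamma, n_tau, masses=MONOMER_KDA):
    """Mass in kDa of a tetramer built from n_gamma gamma and n_tau tau monomers."""
    if n_gamma + n_tau != 4:
        raise ValueError("a tetramer must contain exactly four subunits")
    return round(n_gamma * masses["gamma"] + n_tau * masses["tau"], 1)


if __name__ == "__main__":
    for label, (g, t) in {"gamma4": (4, 0), "gamma2tau2": (2, 2), "tau4": (0, 4)}.items():
        print(label, tetramer_mass(g, t), "kDa")
```

All three species come out well above 150 kDa, in line with elution ahead of the 150 kDa marker.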

[0377] As described earlier, the dnaX frameshifting sequence could produce either a −1 or −2 frameshift to yield a His-tagged γ subunit of mass either 53.3 kDa or 52.4 kDa, respectively. These two possible products are too close in mass to distinguish by migration in SDS gels. It also remains possible that two γ products are present and do not resolve under the conditions used. The exact protocol for this purification is described below.

[0378] Six liters of BL21(DE3)pLysS pET dnaX cells were grown in LB media containing 50 μg/ml ampicillin and 25 μg/ml chloramphenicol at 37° C. to an O.D. of 0.8 and then IPTG was added to a concentration of 2 mM. After a further 2 h at 37° C., cells were harvested by centrifugation and stored at −70° C. The following steps were performed at 4° C. Cells (15 g wet weight) were thawed and resuspended in 45 ml 1× binding buffer (5 mM imidazole, 0.5 M NaCl, 20 mM Tris-HCl (final pH 7.5)) using a dounce homogenizer to complete cell lysis, and 450 μl of 5% polyamine P (Sigma) was added. Cell debris was removed by centrifugation at 18,000 rpm for 30 min. in a Sorvall SS24 rotor at 4° C. The supernatant (Fraction I, 40 ml, 376 mg protein) was applied to a 5 ml HiTrap Chelating Sepharose column (Pharmacia-LKB). The column was washed with 25 ml of binding buffer, then with 30 ml of binding buffer containing 60 mM imidazole, and then eluted with 30 ml of 0.5 M imidazole, 0.5 M NaCl, 20 mM Tris-HCl (pH 7.5). Fractions of 1 ml were collected and analyzed on an 8% Coomassie Blue stained SDS polyacrylamide gel. Fractions containing subunits migrating at the T.th. γ and τ positions, and exhibiting cross reactivity with antibody to E. coli γ and τ in a Western analysis, were pooled and dialyzed against buffer A (20 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 5 mM DTT and 10% glycerol) containing 0.5 M NaCl (Fraction II, 36 mg in 7 ml). Fraction II was diluted 2-fold with buffer A and passed through a 2 ml ATP agarose column equilibrated in buffer A containing 0.2 M NaCl to remove any E. coli γ complex contaminant. Then 0.18 mg (300 μl) Fraction II was gel filtered on a 24 ml Superose 12 column (Pharmacia-LKB) in buffer A containing 0.5 M NaCl. After the first 216 drops, fractions of 200 μl were collected (Fraction III) and analyzed by Western analysis (by procedures similar to those described in Example 6), by ATPase assays and by Coomassie Blue staining of an 8% SDS polyacrylamide gel. The Coomassie stained gels and Western analysis of recombinant T.th. γ and τ for these purification steps are summarized in FIG. 10.
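The protein amounts quoted for Fractions I and II allow a simple bookkeeping check of how much total protein is carried through the chelate column step. The sketch below is illustrative; it tracks total protein only (not activity), and the function name is hypothetical.

```python
# Total protein (mg) reported in the text for each fraction.
FRACTIONS = [("Fraction I", 376.0), ("Fraction II", 36.0)]


def percent_of_start(fractions):
    """Return [(name, mg, percent of the first fraction's protein)] for each step."""
    start_mg = fractions[0][1]
    return [(name, mg, round(100.0 * mg / start_mg, 1)) for name, mg in fractions]


if __name__ == "__main__":
    for name, mg, pct in percent_of_start(FRACTIONS):
        print(f"{name}: {mg} mg ({pct}% of Fraction I protein)")
```

Most of the protein removed at this step is E. coli host protein, so a low percentage here reflects enrichment rather than loss of γ and τ.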

EXAMPLE 6

[0379] Western Analysis of T.th. Cells for Presence of γ and τ Subunits

[0380] Polyclonal antibody to E. coli γ/τ. E. coli γ subunit was prepared as described (Studwell-Vaughan and O'Donnell, 1991). Pure γ subunit (100 μg) was brought up in Freund's adjuvant and injected subcutaneously into a New Zealand Rabbit (Poccono Rabbit Farms). After two weeks, a booster consisting of 50 μg γ in Freund's adjuvant was administered, followed after two weeks by a third injection (50 μg).

[0381] The homology between the amino terminal regions of T.th. and E. coli γ/τ subunits suggested that there may be some epitopes in common between them. Hence, polyclonal antibody directed against the E. coli γ/τ subunits was raised in rabbits for use in probing T.th. cells by Western analysis. FIG. 7 shows the results of a Western analysis of whole T.th. cells lysed in SDS. The results show that in T.th. cells, the antibody is rather specific for two high molecular weight proteins which migrate in the vicinity of the molecular masses of E. coli γ and τ subunits.

[0382] Procedure for Western Analysis

[0383] Samples were analyzed in duplicate 10% SDS polyacrylamide gels by the Western method (Towbin et al. 1979). One gel was Coomassie stained to evaluate the pattern of proteins present, and the other gel was then electroblotted onto a nitrocellulose membrane (Schleicher and Schuell). For molecular size markers, the Kaleidoscope molecular weight markers (Bio-Rad) were used to verify by visualization that transfer of proteins onto the blotted membrane had occurred. The gel used in electroblotting was also stained after electroblotting to confirm that efficient transfer of protein had occurred. Membranes were blocked using 5% non-fat milk, washed with 0.05% Tween in TBS (TBS-T) and then incubated for over 1 h with a 1/5000 dilution of rabbit polyclonal antibody directed against E. coli γ and τ in 1% gelatin in TBS-T at room temperature. Membranes were washed using TBS-T buffer and then antibody was detected on X-ray film (Kodak) by using the ECL kit (Amersham) and the manufacturer's recommended procedures.

[0384] Samples included: 1) a mixture of E. coli γ (15 ng) and τ (15 ng) subunits; 2) T.th. whole cells (100 μl) suspended in cracking buffer; and 3) purified T.th. γ and τ fraction II (0.6 μg as a mixture).

EXAMPLE 7

[0385] Characterization of the ATPase Activity of γ/τ

[0386] The E. coli τ subunit is a DNA dependent ATPase (Lee and Walker, 1987; Tsuchihashi and Kornberg, 1989). The γ subunit binds ATP but does not hydrolyze it even in the presence of DNA unless other subunits of the DNA polymerase III holoenzyme are also present (Onrust et al., 1991). Next we examined the T.th. γ/τ subunits for DNA dependent ATPase activity. The γ/τ preparation was, in fact, a DNA stimulated ATPase (FIG. 11, top panel). The specific activity of the T.th. γ/τ was 11.5 mol ATP hydrolyzed/mol γ/τ (as monomer and assuming an equal mixture of the two). Furthermore, analysis of the gel filtration column fractions shows that the ATPase activity coelutes with the T.th. γ/τ subunits, providing evidence that the weak ATPase activity is intrinsic to the γ/τ subunits (FIG. 11). The specific activity of the γ/τ preparation before gel filtration was the same as after gel filtration (within 10%), further indicating that the DNA stimulated ATPase is an inherent activity of the γ/τ subunits. Presumably, only the τ subunit contains ATPase activity, as in the case of E. coli. Assuming only T.th. τ contains ATPase activity, its specific activity is twice the observed rate (after factoring out the weight of γ). This rate is still only one-fifth that of E. coli τ.

[0387] The T.th. γ/τ ATPase activity is lower at 37° C. than at 65° C. (middle panel), consistent with the expected behavior of protein activity from a thermophilic source. However, there is no apparent increase in activity in proceeding from 50° C. to 65° C. (the rapid breakdown of ATP above 65° C. precluded measurement of ATPase activity at temperatures above 65° C.). In contrast, the E. coli τ subunit lost most of its ATPase activity upon elevating the temperature to 50° C. (middle panel). These reactions contain no stabilizers such as a nonionic detergent or gelatin, nor did they include substrates such as ATP, DNA or magnesium.

[0388] Last, the relative stability of T.th. γ/τ and E. coli γ/τ to addition of NaCl (FIG. 12, bottom panel) was examined. Whereas the E. coli τ subunit rapidly lost activity at even 0.2 M NaCl, the T.th. γ/τ retained full activity in 1.0 M NaCl and was still 80% active in 1.5 M NaCl. The detailed procedure for the ATPase activity assay is described below.

[0389] ATPase Assays

[0390] ATPase assays were performed in 20 μl of 20 mM Tris-HCl (pH 7.5), 8 mM MgCl₂ containing 0.72 μg of M13mp18 ssDNA (where indicated), 100 mM [γ-³²P]-ATP (specific activity of 2000-4000 cpm/pmol), and the indicated protein. Some reactions contained additional NaCl where indicated. Reactions were incubated at the temperatures indicated in the figure legends for 30 min. and then were quenched with an equal volume of 25 mM EDTA (final). The aliquots were analyzed by spotting them (1 μl each) onto thin layer chromatography (TLC) sheets coated with Cel-300 polyethyleneimine (Brinkmann Instruments Co.). TLC sheets were developed in 0.5 M lithium chloride, 1 M formic acid. An autoradiogram of the TLC chromatogram was used to visualize Pi at the solvent front and ATP near the origin, which were then cut from the TLC sheet and quantitated by liquid scintillation. The extent of ATP hydrolyzed was used to calculate the mol of Pi released per mol of protein per min. One mol of E. coli τ was calculated assuming a mass of 71 kDa per monomer. The T.th. γ and τ preparation was treated as an equal mixture and thus one mole of protein as monomer was the average of the predicted masses of the γ and τ subunits (54 kDa).
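The calculation described above (extent of ATP hydrolyzed converted to mol Pi released per mol of protein per min) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and all numerical inputs are hypothetical, and the fraction hydrolyzed is taken as the Pi counts divided by the sum of the Pi and ATP counts cut from the TLC sheet.

```python
def atpase_rate(cpm_pi, cpm_atp, atp_pmol, protein_ng, mass_kda, minutes):
    """Return mol Pi released per mol protein (as monomer) per min.

    cpm_pi, cpm_atp : counts in the Pi and ATP spots from the TLC sheet
    atp_pmol        : total ATP present in the reaction, in pmol
    protein_ng      : protein added to the reaction, in ng
    mass_kda        : monomer mass in kDa (e.g. 54 for the T.th. mixture, 71 for E. coli tau)
    minutes         : incubation time
    """
    fraction_hydrolyzed = cpm_pi / (cpm_pi + cpm_atp)
    pi_pmol = fraction_hydrolyzed * atp_pmol
    protein_pmol = protein_ng / mass_kda  # ng divided by kDa gives pmol
    return pi_pmol / protein_pmol / minutes


if __name__ == "__main__":
    # Hypothetical counts and amounts, chosen only to exercise the arithmetic.
    rate = atpase_rate(cpm_pi=1000, cpm_atp=3000, atp_pmol=2000,
                       protein_ng=540, mass_kda=54, minutes=30)
    print(f"{rate:.2f} mol Pi / mol protein / min")
```

Using the ratio of counts rather than absolute counts makes the result independent of how much of the quenched reaction was spotted.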

EXAMPLE 8

[0391] Homology of T.th. γ/τ to dnaX Gene Products of Other Organisms

[0392] The XbaI insert encoded an open reading frame, starting with a GTG codon, of 529 amino acids in length (58.0 kDa), closer to the predicted length of the B. subtilis τ subunit (563 amino acids, 62.7 kDa mass) (Alonso et al., 1990) than the E. coli τ subunit (71.1 kDa) (Yin et al., 1986). The dnaX gene encoding the γ/τ subunits of E. coli DNA polymerase III holoenzyme is homologous to the holB gene encoding the δ′ subunit of the γ complex clamp loader, and this homology extends to all 5 subunits of the eukaryotic RFC clamp loader as well as the bacteriophage gene protein 44 of the gp44/62 clamp loading complex (O'Donnell et al., 1993). These gene products show greatest homology over the N-terminal 166 amino acid residues (of E. coli dnaX); the C-terminal regions are more divergent. FIG. 4 shows an alignment of the amino acid sequence of the N-terminal regions of the T.th. dnaX gene product to those of several other bacteria. The consensus GXXGXGKT (SEQ. ID. No. 17) motif for nucleotide binding is conserved in all these protein products. Further, the E. coli δ′ crystal structure reveals one atom of zinc coordinated to four Cys residues (Guenther, 1996). These four Cys residues are conserved in the E. coli dnaX gene, and the γ and τ subunits encoded by E. coli dnaX bind one atom of zinc. These Cys residues are also conserved in T.th. dnaX (shown in FIG. 4). Overall, the level of amino acid identity relative to E. coli dnaX in the N-terminal 165 residues of T.th. dnaX is 53%. The T.th. dnaX gene is just as homologous to the B. subtilis dnaX (53% identity) gene relative to E. coli dnaX. After this region of homology, the C-terminal region of T.th. dnaX shares 26% and 20% identity to E. coli and B. subtilis dnaX, respectively. A proline rich region, downstream of the conserved region, is also present in T.th. dnaX (residues 346-375), but not in the B. subtilis dnaX (see FIGS. 3A and 3B). The overall identity between E. coli dnaX and T.th. dnaX over the entire gene is 34%. Identity of T.th. dnaX to B. subtilis dnaX over the entire gene is 28%.
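The two kinds of comparison used above, locating the GXXGXGKT nucleotide binding motif and scoring percent identity over an alignment, can be sketched in a few lines. The sequences in this example are short hypothetical strings, not the dnaX sequences of FIG. 4, and the function names are illustrative. Identity is computed here over aligned columns in which neither sequence has a gap, which is one common convention and not necessarily the one used for the figures quoted in the text.

```python
import re

# GXXGXGKT, where X is any residue.
NUCLEOTIDE_BINDING_MOTIF = re.compile(r"G..G.GKT")


def find_motif(protein):
    """Return the 0-based start positions of every GXXGXGKT match."""
    return [m.start() for m in NUCLEOTIDE_BINDING_MOTIF.finditer(protein.upper())]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    toy = "MKAAGPRGVGKTSLAR"  # hypothetical sequence containing the motif
    print(find_motif(toy))
    print(percent_identity("GPRGVGKT", "GTRGVGKT"))
```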

[0393] Comparison of dnaX genes from T.th. and E. coli

[0394] The above identifies a homologue of the dnaX gene of E. coli in Thermus thermophilus. Like the E. coli gene, T.th. dnaX encodes two related proteins through use of a highly efficient translational frameshift. The T.th. γ/τ subunits are tetramers, or mixed tetramers, similar to the γ and τ subunits of E. coli. Further, the γ/τ subunit is a DNA stimulated ATPase like its E. coli counterpart. As expected for proteins from a thermophile, the T.th. γ/τ ATPase activity is thermostable and resistant to added salt.

[0395] In E. coli, γ is a component of the clamp loader, and the τ subunit serves the function of holding the clamp loading apparatus together with two DNA polymerases for coordinated replication of duplex DNA. The presence of γ in T.th. suggests it has a clamp loading apparatus and thus a clamp as well. The presence of the τ subunit of T.th. implies that T.th. contains a replicative polymerase with a structure similar to that of E. coli DNA polymerase III holoenzyme.

[0396] A significant difference between E. coli and T.th. dnaX genes is in the translational frameshift sequence. In E. coli the heptamer frameshift site contains six A residues followed by a G residue in the context A AAA AAG. This sequence satisfies the X XXY YYZ rule for −1 frameshifting. The frameshift is made more efficient by the absence of the AAG tRNA for Lys which presumably leads to stalling of the ribosome at the frameshift site and increases the efficiency of frameshifting (Tsuchihashi and Brown, 1992). Two additional aids to frameshifting include a downstream hairpin and an upstream Shine-Dalgarno sequence (Tsuchihashi and Kornberg, 1990; Larsen et al., 1994). The −1 frameshift leads to incorporation of one unique residue at the C-terminus of E. coli γ before encounter with a stop codon.

[0397] In T.th., the dnaX frameshifting heptamer is A AAA AAA, and it is flanked by two other A residues, one on each side. There is also a downstream region of secondary structure. The nearest downstream stop codon is positioned such that γ would contain only one unique amino acid, as in E. coli. However, the T.th. stop codon is in the −2 reading frame and thus requires a −2 frameshift. No precedent exists in nature for −2 frameshifting, although −2 frameshifting has been shown to occur in test cases (Weiss et al., 1987). In vivo analysis of the T.th. frameshift sequence shows that this natural sequence promotes both −1 and −2 frameshifting in E. coli. Whereas the −2 frameshift results in only one unique C-terminal residue, a −1 frameshift would result in an extension of 12 C-terminal residues. At present, the results do not discriminate which path occurs in T.th., a −1 or −2 frameshift, or a combination of the two.
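The X XXY YYZ rule cited for the E. coli site can be expressed as a small predicate: the first three nucleotides are identical, the next three are identical, and the seventh is unconstrained. The sketch below is illustrative only. The function names are hypothetical, the test sequences other than the two heptamers quoted in the text are made up, and the scan ignores reading frame for simplicity.

```python
def is_slippery(heptamer):
    """True if the 7-mer fits X XXY YYZ (positions 1-3 identical, positions 4-6 identical)."""
    h = heptamer.replace(" ", "").upper()
    if len(h) != 7:
        return False
    return h[0] == h[1] == h[2] and h[3] == h[4] == h[5]


def find_slippery(seq):
    """Return 0-based start positions of every 7-mer window that fits the rule."""
    s = seq.replace(" ", "").upper()
    return [i for i in range(len(s) - 6) if is_slippery(s[i:i + 7])]


if __name__ == "__main__":
    print(is_slippery("A AAA AAG"))  # E. coli dnaX heptamer from the text
    print(is_slippery("A AAA AAA"))  # T.th. dnaX heptamer from the text
    print(find_slippery("GCAAAAAAGCT"))  # hypothetical test sequence
```

Both heptamers satisfy the pattern (with X and Y both equal to A), so the rule by itself does not distinguish the E. coli site from the T.th. site; the difference described in the text lies in the reading frame of the downstream stop codon.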

[0398] There are two Shine-Dalgarno sequences just upstream of the frameshift site in T.th. dnaX. In two cases of frameshifting in E. coli, an upstream Shine-Dalgarno sequence has been shown to stimulate frameshifting (reviewed in Weiss et al., 1987). In release factor 2 (RF2), the Shine-Dalgarno is 3 nucleotides upstream of the shift site, and it stimulates a +1 frameshift event. In the case of E. coli dnaX, a Shine-Dalgarno sequence 10 nucleotides upstream of the shift sequence stimulates the −1 frameshift. One of the T.th. dnaX Shine-Dalgarno sequences is immediately adjacent to the frameshift sequence with no extra space; the other is 22 residues upstream of the frameshift site. Which of these Shine-Dalgarno sequences plays a role in T.th. dnaX frameshifting, if any, will require future study.

[0399] In E. coli, efficient separation of the two polypeptides, γ and τ, is achieved by mutation of the frameshift site such that only one polypeptide is produced from the gene (Tsuchihashi and Kornberg, 1990). Substitution of G-to-A in two positions of the heptamer of T.th. dnaX eliminates frameshifting and thus should be a source to obtain τ subunit free of γ. To produce pure γ subunit free of τ, the frameshifting site and the sequence immediately downstream of it can be replaced with an in-frame sequence with a stop codon.

[0400] Examination of the B. subtilis dnaX gene shows no frameshift sequence that satisfies the X XXY YYZ rule. Hence, it would appear that dnaX does not make two proteins in this gram positive organism.

[0401] Rapid thermal motions associated with high temperature may make coordination of complicated processes more difficult. It seems possible that organizing the components of the replication apparatus may become yet more important at higher temperature. Hence, production of a τ subunit that could be used to crosslink two polymerases and a clamp loader into one organized particle may be most useful at elevated temperature.

[0402] As stated above, the following examples describe the continued isolation and purification of the substantial entirety of the Polymerase III from the extreme thermophile Thermus thermophilus. It is to be understood that the following exposition is reflective of the protocol and characteristics, both morphological and functional, of the Polymerase III-type enzymes that are the focus of the present invention, and that the invention is hereby illustrated and comprehends the entire class of enzymes of thermophilic origin.

EXAMPLE 9

[0403] Purification of the Thermus thermophilus DNA Polymerase III

[0404] All steps in the purification were performed at 4° C. The following assay was used in the purification of DNA polymerase from T.th. cell extracts. Assays contained 2.5 mg activated calf thymus DNA (Sigma Chemical Company) in a final volume of 25 ml of 20 mM Tris-Cl (pH 7.5), 8 mM MgCl₂, 5 mM DTT, 0.5 mM EDTA, 40 mg/ml BSA, 4% glycerol, 0.5 mM ATP, 3 mM each dCTP, dGTP, dATP, and 20 mM [α-³²P]dTTP. An aliquot of the fraction to be assayed was added to the assay mixture on ice followed by incubation at 60° C. for 5 min. DNA synthesis was quantitated using DE81 paper followed by washing off unincorporated nucleotide. Incorporated nucleotide was determined by scintillation counting of the filters.

[0405] Thermus thermophilus cell extracts were prepared by suspending 35 grams of cell paste in 200 ml of 50 mM Tris-HCl, pH=7.5, 3.0 mM spermidine, 100 mM NaCl, 0.5 mM EDTA, 5 mM DTT, 5% glycerol, followed by disruption by passage through a French pressure cell (15,000 PSI). Cell debris was removed by centrifugation (12,000 RPM, 60 min). DNA polymerase III in the clarified supernatant was precipitated by treatment with ammonium sulfate (0.226 gm/liter) and recovered by centrifugation. This fraction was then backwashed with the same buffer (but lacking spermidine) containing 0.20 gm/l ammonium sulfate. The pellet was then resuspended in buffer A and dialyzed overnight against 2 liters of buffer A; a precipitate which formed during dialysis was removed by centrifugation (17,000 RPM, 20 min).

[0406] The clarified dialysis supernatant, containing approximately 336 mg of protein, was applied onto a 60 ml heparin agarose column equilibrated in buffer A, which was washed with the same buffer until A280 reached baseline. The column was developed with a 500 ml linear gradient of buffer A from 0 to 500 mM NaCl. More tightly adhered proteins were washed off the column by treatment with buffer A (20 mM Tris-HCl, pH=7.5, 0.1 mM EDTA, 5 mM DTT, and 10% glycerol) and 1 M NaCl. Some DNA polymerase activity flowed through the column. Two peaks (HEP.P1 and HEP.P2) of DNA polymerase activity eluted from the heparin agarose column containing 20 mg and 2 mg of total protein respectively (FIG. 13A). These were kept separate throughout the remainder of the purification protocol.

[0407] The Pol III resided in HEP.P1 as indicated by the following criteria: 1) Western analysis using antibody directed against the α subunit of E. coli Pol III indicated presence of Pol III in HEP.P1; 2) Only the HEP.P1 fraction was capable of extending a single primer around an M13mp18 7.2 kb ssDNA circle (explained later in Example 16), such long primer extension being a characteristic of Pol III type enzymes; and 3) Only the HEP.P1 provided DNA polymerase activity that was retained on an ATP-agarose affinity column, which is indicative of a Pol III-type DNA polymerase since the γ and τ subunits are ATP interactive proteins.

[0408] The first peak of the heparin agarose column (HEP.P1: 20 mg in 127.5 ml) was dialyzed against buffer A and applied onto a 2 ml N6-linkage ATP agarose column pre-equilibrated in the same buffer. Bound protein was eluted by a slow (0.05 ml/min) wash with buffer A + 2 M NaCl and collected into 200 μl fractions. Chromatography of peak HEP.P1 yielded a flow-through (HEP.P1-ATP-FT) and a bound fraction (HEP.P1-ATP-Bound) (FIG. 13B). Binding of peak HEP.P2 to the ATP column could not be detected, though DNA polymerase activity was recovered in the flow-through.

[0409] The HEP.P1-ATP-Bound fractions from the ATP agarose chromatographic step were further purified by anion exchange over MonoQ. The HEP.P1-ATP-Bound fractions were diluted with buffer A to approximately the conductivity of buffer A plus 25 mM NaCl and applied to a 1 ml MonoQ column equilibrated in buffer A. DNA polymerase activity eluted in the flow-through and in two resolved chromatographic peaks (MONOQ peak1 and peak2) (FIG. 13C). Peak 2 was by far the major source of DNA polymerase activity. Western analysis using rabbit antibody directed against the E. coli α subunit confirmed presence of the α subunit in the second peak (see the Western analysis in FIG. 14B). Antibody against the E. coli τ subunit also confirmed the presence of the τ subunit in the second peak. Some reaction against α and τ was also present in the minor peak (first peak). The Coomassie Blue SDS polyacrylamide gel of the MonoQ fractions (FIG. 14A) showed a band that co-migrated with E. coli α and was in the same position as the antibody reactive material (antibody against E. coli α). Also present are bands corresponding to τ, γ, δ, and δ′. These subunits, along with β, are all that is necessary for rapid and processive synthesis and primer extension over a long (>7 kb) stretch of ssDNA in the case of E. coli DNA Polymerase III holoenzyme.

[0410] The Pol III-type enzyme purified from T.th. may be a Pol III*-like enzyme that contains the DNA polymerase and clamp loader subunits (i.e., like the Pol III* of E. coli). The evidence for this is: 1) the presence of dnaX and dnaE gene products in the same column fractions as indicated by Western analysis (see above); 2) the ability of this enzyme to extend a primer around a 7.2 kb circular ssDNA upon adding only β (see Example 16); 3) stimulation of Pol III by adding β on linear DNA, indicating β subunit is not present in saturating amounts (see Example 15); 4) the presence of τ in T.th., which may glue the polymerase and clamp loader into a Pol III* as in E. coli; and 5) the comigration of α with subunits τ, γ, δ and δ′ of the clamp loader in the column fractions of the last chromatographic step (MonoQ, FIG. 14A).

[0411] Micro-Sequencing of T.th DNA Polymerase III α Subunit

[0412] The α subunit from the purified T.th. DNA polymerase III (HEP.P1.ATP-Bound.MONOQ peak2) was blotted onto PVDF membrane and was cut out of the SDS-PAGE gel and submitted to the Protein-Nucleic Acid Facility at Rockefeller University for N-terminal sequencing and proteolytic digestion, purification and microsequencing of the resultant peptides. Analysis of the α candidate band (Mw 130 kD) yielded four peptides, two of which (TTH1, TTH2) showed sequence similarity to α subunits from various bacterial sources (see FIG. 15).

EXAMPLE 10

[0413] Identification of the Thermus thermophilus dnaE Gene Encoding the α Subunit of DNA Polymerase III Replication Enzyme

[0414] Cloning of the dnaE gene was started with the sequence of the TTH1 peptide from the purified α subunit (FFIEIQNHGLSEQK) (SEQ. ID. No. 61). The fragment was aligned to a region at approximately 180 amino acids downstream of the N-termini of several other known α subunits as shown in FIG. 15. The upstream 33 mer (5′-GTGGGATCCGTGGTTCTGGATCTCGATGAAGAA-3′) (SEQ. ID. No. 31) consists of a BamHI site within the first 9 nucleotides (underlined) and the sequence coding for the following peptide HGLSEQK on the complementary strand. The downstream 29 mer (5′-GTGGGATCCACGGSCTSTCSGAGCAGAAG-3′) (SEQ. ID. No. 32) consists of a BamHI site within the first 9 nucleotides (underlined) and the following sequence coding for the peptide FFIEIQNH (SEQ. ID. No. 62).

[0415] These two primers were directed away from each other for the purpose of performing inverse PCR (also called circular PCR). The amplification reactions contained 10 ng T.th. genomic DNA (that had been cut and religated with XmaI), 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.25 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0416] 1. 4 cycles of: 95.5° C.—30 sec., 45° C.—30 sec., 75° C.—8 min.

[0417] 2. 6 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 75° C.—6 min.

[0418] 3. 30 cycles of: 95.5° C.—30 sec., 52.5° C.—30 sec., 75° C.—5 min.
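A cycling scheme of this kind can be written down as a small data structure, which makes it easy to compare the schemes used in the different amplifications or to estimate run time. The sketch below is illustrative: it reads the three numbered stages above as 4, 6 and 30 cycles, counts only the programmed hold times (ramp times between temperatures are ignored), and uses hypothetical names throughout.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in seconds), ...])
INVERSE_PCR_SCHEME = [
    (4, [(95.5, 30), (45.0, 30), (75.0, 8 * 60)]),
    (6, [(95.5, 30), (50.0, 30), (75.0, 6 * 60)]),
    (30, [(95.5, 30), (52.5, 30), (75.0, 5 * 60)]),
]


def total_hold_seconds(scheme):
    """Sum of programmed hold times over all stages (ramping is not included)."""
    return sum(cycles * sum(hold for _, hold in steps) for cycles, steps in scheme)


def total_cycles(scheme):
    """Total number of amplification cycles across all stages."""
    return sum(cycles for cycles, _ in scheme)


if __name__ == "__main__":
    seconds = total_hold_seconds(INVERSE_PCR_SCHEME)
    print(total_cycles(INVERSE_PCR_SCHEME), "cycles,", seconds / 60, "min of hold time")
```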

[0419] A 1.4 kb fragment was obtained and cloned into pBS-SK:BamHI (i.e., pBS-SK (Stratagene) was cut with BamHI). This sequence was bracketed by the 29 mer primer on both sides and contained the sequence coding for the N-terminal part of the subunit up to the peptide used for primer design.

[0420] To obtain further dnaE gene sequence, the TTH2 peptide was used. It was aligned to a region about 600 amino acids from the N-termini of the other known subunits (FIG. 15B).

[0421] The upstream 34 mer (5′-GCGGGATCCTCAACGAGGACCTCTCCATCTTCAA-3′) (SEQ. ID. No. 33) consists of a BamHI site within the first 9 nucleotides (underlined) and the sequence from the end of the fragment previously obtained. The downstream 35 mer (5′-GCGGGATCCTTGTCGTCSAGSGTSAGSGCGTCGTA-3′) (SEQ. ID. No. 34) consists of a BamHI site within the first 9 nucleotides (underlined) and the following sequence coding for the peptide YDALTLDD (SEQ. ID. No. 63) on the complementary strand. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.25 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0422] 1. 4 cycles of: 95.5° C.—30 sec., 45° C.—30 sec., 75° C.—8 min.

[0423] 2. 6 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 75° C.—6 min.

[0424] 3. 30 cycles of: 95.5° C.—30 sec., 55° C.—30 sec., 75° C.—5 min.

[0425] A 1.2 kb PCR fragment was obtained and cloned into pUC19:BamHI. The fragment was bracketed by the downstream primer on both sides and contained a region overlapping by 56 bp with the fragment previously cloned.

[0426] To obtain yet more dnaE sequence, the following primers were used. The upstream 39 mer (3′-GTGTGGATCCTCGTCCCCCTCATGCGCGACCAGGAAGGG-5′) (SEQ. ID. Nos. 35 and 114) consists of a BamHI site within the first 10 nucleotides (underlined) and the sequence from the end of the fragment previously obtained. The downstream 27 mer (5′-GTGTGGATCCTTCTTCTTSCCCATSGC-3′) (SEQ. ID. No. 36) consists of a BamHI site within the first 10 nucleotides (underlined), and the sequence coding for the peptide AMGKKK (SEQ. ID. No. 64) (at position approximately 800 residues from the N terminus) on the complementary strand. The AMGKKK (SEQ. ID. No. 64) sequence was chosen for primer design as it is highly conserved among the known gram-negative α subunits. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Taq polymerase reaction mixture containing 10 μl PCR Buffer, 0.5 mM of each dNTP and 2.5 mM MgCl₂. Amplification was performed using the following cycling scheme:

[0427] 1. 3 cycles of: 95.5° C.—30 sec., 45° C.—30 sec., 72° C.—8 min.

[0428] 2. 6 cycles of: 94.5° C.—30 sec., 55° C.—30 sec., 72° C.—6 min.

[0429] 3. 32 cycles of: 94.5° C.—30 sec., 50° C.—30 sec., 72° C.—5 min.

[0430] A 2.3 kb PCR fragment was obtained instead of the expected 0.6 kb fragment. BamHI digestion of the PCR product resulted in three fragments of 1.1 kb, 0.7 kb and 0.5 kb. The 1.1 kb fragment was cloned into pUC19:BamHI. It turned out to be the one adjacent to the fragment previously obtained and contained the dnaE sequence right up to the region coding for the AMGKKK (SEQ. ID. No. 64) peptide, but was disrupted by an intein just upstream of this region. The sequence that follows this was amplified from the 2.3 kb original PCR product using the same conditions and cycling scheme as for the 2.3 kb fragment. The downstream primer was the same as in the previous step. The upstream 27 mer (3′-GTGTGGATCCGTGGTGACCTTAGCCAC-5′) (SEQ. ID. Nos. 37 and 115) consisted of a BamHI site within the first 9 nucleotides (underlined) and the sequence from the end of the 1.1 kb fragment previously described.
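The fragment sizes reported for the BamHI digest can be sanity-checked by confirming that they sum to the size of the undigested product, and the same bookkeeping can be applied to any linear sequence. The sketch below is illustrative only: the test sequence is a made-up toy string, not the dnaE PCR product, and the function name is hypothetical. BamHI is taken to recognize GGATCC and cut after the first G.

```python
BAMHI_SITE = "GGATCC"
BAMHI_CUT_OFFSET = 1  # BamHI cuts G^GATCC, i.e. one base into the site


def digest_linear(seq, site=BAMHI_SITE, cut_offset=BAMHI_CUT_OFFSET):
    """Return fragment lengths (5' to 3') after cutting a linear sequence at every site."""
    s = seq.upper()
    cuts = [i + cut_offset for i in range(len(s) - len(site) + 1)
            if s[i:i + len(site)] == site]
    edges = [0] + cuts + [len(s)]
    return [end - start for start, end in zip(edges, edges[1:])]


if __name__ == "__main__":
    toy = "AAAA" + "GGATCC" + "TTTTT" + "GGATCC" + "CC"  # hypothetical 23-nt sequence
    print(digest_linear(toy))
    # Fragment sizes (kb) reported in the text for the 2.3 kb product.
    reported_kb = [1.1, 0.7, 0.5]
    print(round(sum(reported_kb), 1))
```

Two internal sites in a linear molecule give three fragments, and the reported 1.1, 0.7 and 0.5 kb fragments add up to the 2.3 kb product.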

[0431] The expected 1.2 kb PCR fragment was obtained and cloned into pUC19:SmaI. This fragment coded for the rest of the intein, and the end of it was used to obtain the next sequence of dnaE downstream of this region. The upstream 30 mer (3′-TTCGTGTCCGAGGACCTTGTGGTCCACAAC-5′) (SEQ. ID. Nos. 38 and 116) was a sequence from the end of the intein. The downstream 23 mer (5′-CCAGAATCGTCTGCTGGTCGTAG-3′) (SEQ. ID. No. 39) was the sequence from the end of the dnaE gene of D. rad. (coding on the complementary strand for the region slightly homologous in the distantly related α subunits and possibly highly homologous between T.th. and D. rad. α subunits). The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.1 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0432] 1. 3 cycles of: 95.5° C.—30 sec., 55° C.—30 sec., 75° C.—8 min.

[0433] 2. 32 cycles of: 94.5° C.—30 sec., 50° C.—30 sec., 75° C.—5 min.

[0434] A 2.5 kb PCR fragment was obtained and cloned into pUC19:SmaI. This fragment contained the dnaE sequence coding for the 300 amino acids next to the AMGKKK (SEQ. ID. No. 64) region, disrupted by yet a second intein inside another sequence that is conserved among the known α subunits (FNKSHSAAY) (SEQ. ID. No. 65).

[0435] To obtain the rest of the dnaE gene, the upstream 19 mer (5′-AGCACCCTGGAGGAGCTTC-3′) (SEQ. ID. No. 40) from the end of the known dnaE sequence was used. The downstream primer was: 5′-CATGTCGTACTGGGTGTAC-3′ (SEQ. ID. No. 41). The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.1 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0436] 1. 3 cycles of: 95.5° C.—30 sec., 55° C.—30 sec., 75° C.—8 min.

[0437] 2. 32 cycles of: 94.5° C.—30 sec., 50° C.—30 sec., 75° C.—5 min.

[0438] A 1.0 kb fragment bracketed by this upstream primer was obtained.It contained the 3′ end of the dnaE gene.

EXAMPLE 11

[0439] Cloning and Expression of the Thermus thermophilus dnaQ Gene Encoding the ε Subunit of DNA Polymerase III Replication Enzyme

[0440] Cloning of dnaQ

[0441] The dnaQ gene of E. coli and the corresponding region of PolC of B. subtilis, evolutionarily divergent organisms, share approximately 30% identity. Comparison of the predicted amino acid sequences for DnaQ (ε) of E. coli and PolC of B. subtilis revealed two highly conserved regions (FIG. 17). Within each of these regions, a nine amino acid sequence was used to design two oligonucleotide primers for use in the polymerase chain reaction.

[0442] The regions highly conserved among Pol III exonucleases were chosen to design the degenerate primers for the amplification of a T.th. dnaQ internal fragment (see FIG. 17). DNA oligonucleotides for amplification of T.th. genomic DNA were as follows. The upstream 27 mer (5′-GTSGTSNNSGACNNSGAGACSACSGGG-3′) (SEQ. ID. No. 42) encodes the following sequence (VVXDXETTG) (SEQ. ID. No. 66). The downstream 27 mer (5′-GAASCCSNNGTCGAASNNGGCGTTGTG-3′) (SEQ. ID. No. 43) encodes the sequence HNAXFDXGF (SEQ. ID. No. 67) on the complementary strand. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.5 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0443] 1. 5 cycles of: 95.5° C.—30 sec., 40° C.—30 sec., 72° C.—2 min.

[0444] 2. 5 cycles of: 95.5° C.—30 sec., 45° C.—30 sec., 72° C.—2 min.

[0445] 3. 30 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 72° C.—30 min.

[0446] Products were visualized in a 1.5% native agarose gel. A fragment of the expected size of 270 bp was cloned into the SmaI site of pUC19 and sequenced with the CircumVent Thermal Cycle DNA sequencing kit according to the manufacturer's instructions (New England Biolabs).
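The relationship between the degenerate primers above (SEQ. ID. Nos. 42 and 43) and the conserved peptides they target can be checked with a short script. The following is a minimal sketch, not part of the original procedure: it assumes the standard genetic code and the IUPAC meanings S = G or C and N = any base, and the helper names are hypothetical. A codon whose degenerate positions allow more than one amino acid is reported as X.

```python
from itertools import product

# IUPAC codes appearing in the primers (S = G or C, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "S": "S", "N": "N"}

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i] for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain S and N."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate_degenerate(seq):
    """Translate complete codons; emit X where a degenerate codon is ambiguous."""
    peptide = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        expansions = product(*(IUPAC[base] for base in seq[i:i + 3]))
        residues = {CODON_TABLE["".join(codon)] for codon in expansions}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


UPSTREAM = "GTSGTSNNSGACNNSGAGACSACSGGG"    # SEQ. ID. No. 42
DOWNSTREAM = "GAASCCSNNGTCGAASNNGGCGTTGTG"  # SEQ. ID. No. 43

upstream_peptide = translate_degenerate(UPSTREAM)
downstream_peptide = translate_degenerate(reverse_complement(DOWNSTREAM))
print(upstream_peptide, downstream_peptide)
```

The downstream primer is reverse complemented before translation because it is stated to encode its peptide on the complementary strand.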

[0447] To obtain further sequence of the dnaQ gene, genomic DNA was digested with either XhoI, BamHI, KpnI or NcoI. These restriction enzymes were chosen because they cut T.th. genomic DNA frequently. Approximately 0.1 μg of DNA for each digest was ligated by T4 DNA ligase in 50 μl of ligation buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl₂, 10 mM dithiothreitol, 1 mM ATP, 25 mg/ml bovine serum albumin) overnight at 20° C. The ligation mixtures were used for circular PCR.

[0448] DNA oligonucleotides for amplification of T.th. genomic DNA were the following. The upstream 27 mer (5′-CGGGGATCCACCTCAATCACCTCGTGG-3′) (SEQ. ID. No. 44) consists of a BamHI site within the first 9 nucleotides (underlined) and the sequence complementary to the 42-61 bp region of the previously cloned dnaQ fragment. The downstream 30 mer (5′-CGGGGATCCGCCACCTTGCGGCTCCGGGTG-3′) (SEQ. ID. No. 45) consists of a BamHI site within the first 9 nucleotides (underlined) and the sequence corresponding to the 240-261 bp region of the dnaQ fragment (see FIG. 17).
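Because the underlining that marked the restriction sites does not survive in plain text, the position of the BamHI site in each primer can be recovered by a simple string search. The sketch below is illustrative only; it assumes the standard BamHI recognition sequence GGATCC, and the function name is hypothetical.

```python
BAMHI = "GGATCC"  # standard BamHI recognition sequence (assumed here)


def site_span(primer, site):
    """Return the 1-based (start, end) of the first occurrence of site, or None."""
    index = primer.find(site)
    if index < 0:
        return None
    return index + 1, index + len(site)


UPSTREAM = "CGGGGATCCACCTCAATCACCTCGTGG"       # SEQ. ID. No. 44
DOWNSTREAM = "CGGGGATCCGCCACCTTGCGGCTCCGGGTG"  # SEQ. ID. No. 45

for name, primer in (("upstream", UPSTREAM), ("downstream", DOWNSTREAM)):
    print(name, len(primer), site_span(primer, BAMHI))
```

In both primers the site occupies nucleotides 4 through 9, consistent with the statement that it lies within the first 9 nucleotides, and the lengths agree with the stated 27 mer and 30 mer.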

[0449] The amplification reactions contained 1 ng T.th. genomic DNA (that had been cut with NcoI and religated into circular DNA for circular PCR), 0.4 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP, 0.5 mM MgSO₄, and 10% DMSO. Circular amplification was performed using the following cycling scheme:

[0450] 1. 5 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 72° C.—8 min.

[0451] 2. 35 cycles of: 95.5° C.—30 sec., 55° C.—30 sec., 72° C.—6 min.

[0452] 3. 72° C.—10 min.
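As a planning aid, the programmed hold time of the three-step circular PCR scheme above can be totalled directly from the listed steps. This is a rough sketch under the assumption that only the stated hold times count; ramping between temperatures is instrument dependent and is ignored, and the data layout is hypothetical.

```python
# Each entry: (number of cycles, hold times in seconds for one cycle).
PROGRAM = [
    (5, [30, 30, 8 * 60]),   # 95.5, 50 and 72 deg C holds
    (35, [30, 30, 6 * 60]),  # 95.5, 55 and 72 deg C holds
    (1, [10 * 60]),          # final 72 deg C hold
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all steps and cycles."""
    return sum(cycles * sum(holds) for cycles, holds in program)


total = total_hold_seconds(PROGRAM)
print(f"{total} s = {total / 3600:.1f} h")
```

Under these assumptions the program amounts to 18000 s, or 5 hours, of hold time before any ramping overhead.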

[0453] A 1.5 kb fragment was obtained and cloned into the BamHI site of the pUC19 vector. Partial sequencing of the fragment revealed that it contained the dnaQ regions adjacent to sequences corresponding to the PCR primers and hence contained the sequences both upstream and downstream of the previously cloned dnaQ fragment. One of the NcoI sites turned out to be approximately 300 bp downstream of the end of the first cloned dnaQ sequence and hence did not include the 3′ end of dnaQ. To obtain the 3′ end, another inverse PCR reaction was performed. Since an ApaI restriction site was recognized within this newly sequenced dnaQ fragment, the circular PCR procedure was performed using as template an ApaI digest of T.th. genomic DNA that was ligated (circularized) under the same conditions as described above.

[0454] DNA oligonucleotides for amplification of the ApaI/religated T.th. genomic DNA were as follows. The upstream 31 mer (5′-GCGCTCTAGACGAGTTCCCAAAGCGTGCGGT-3′) (SEQ. ID. No. 46) consists of an XbaI site within the first 10 nucleotides (underlined) and the sequence complementary to the region downstream of the ApaI restriction site in the newly sequenced dnaQ fragment. The downstream 25 mer (5′-CGCGTCTAGATCACCTGTATCCAGA-3′) (SEQ. ID. No. 47) consists of an XbaI site within the first 10 nucleotides (underlined) and the sequence corresponding to another region downstream of the ApaI restriction site in the newly sequenced dnaQ fragment. The 1.7 kb PCR fragment was cloned into the XbaI site of the pUC19 vector and partially sequenced. The sequence of dnaQ, and the protein sequence of the ε subunit encoded by it, is shown in FIG. 18.

[0455] The ε subunit is encoded by an open reading frame of 209 (or 190 depending on which Val is used as the initiating residue) amino acids in length (23598.5 Da, or 21383.8 Da for the shorter version), similar to the length of the E. coli ε subunit (243 amino acids, 27099.1 Da mass) (see FIG. 17).

[0456] The entire amino acid sequence of the ε subunit predicted from the T.th. dnaQ gene aligns with the predicted amino acid sequence of the dnaQ genes of other organisms with only a few gaps and insertions (the first two amino acids, and four positions downstream) (FIG. 17). The consensus motifs VVXDXETTG (SEQ. ID. Nos. 66 and 68), HNAXFDXGF (SEQ. ID. No. 67), and HRALYD (SEQ. ID. No. 70), characteristic for exonucleases, are conserved. Overall, the level of amino acid identity relative to most of the known ε subunits, or corresponding proofreading exonuclease domains of gram positive PolC genes is approximately 30%. Upstream of start 1 (FIG. 17) there were stop codons in all three reading frames.

[0457] Expression of dnaQ

[0458] The dnaQ gene was cloned into the pET24-a expression vector in two steps. First, the PCR fragment encoding the N-terminal part of the gene was cloned into the NdeI/ApaI sites of the pUC19 plasmid containing the ApaI inverse PCR fragment. DNA oligonucleotides for amplification of T.th. genomic DNA were as follows. The upstream 33 mer (5′-GCGGCGCATATGGTGGTGGTCCTGGACCTGGAG-3′) (SEQ. ID. No. 48) consists of an NdeI site within the first 12 nucleotides (underlined) and the beginning of the dnaQ gene. The downstream 25 mer (5′-CGCGTCTAGATCACCTGTATCCAGA-3′) (SEQ. ID. No. 49), already used for ApaI circular PCR, consists of an XbaI site within the first 10 nucleotides (underlined) and the sequence corresponding to the region downstream of the ApaI restriction site. The 2.2 kb NdeI/SalI fragment was then cloned into the NdeI/XhoI sites of the pET16 vector to produce pET24-a:dnaQ. The ε subunit was expressed in the BL21/LysS strain transformed by the pET24-a:dnaQ plasmid.

EXAMPLE 12

[0459] The Thermus thermophilus dnaN Gene Encoding the β Subunit of DNA Polymerase III Replication Enzyme

[0460] Strategy of Cloning dnaN by use of dnaA

[0461] DnaN proteins are highly divergent in bacteria, making it difficult to clone them by homology. The level of identity between DnaN representatives from E. coli and B. subtilis is as low as 18%. These 18% of identical amino acid residues are dispersed through the proteins rather than clustering together in conserved regions, further complicating use of homology to design PCR primers. However, one feature of dnaN genes among widely different bacteria is their location in the chromosome. They appear to be near the origin, and immediately adjacent to the dnaA gene. The dnaA genes show good homology among different bacteria and, thus, dnaA was first cloned in order to obtain a DNA probe that is likely near dnaN.

[0462] Identification of dnaA and dnaN

[0463] The dnaA genes of E. coli and B. subtilis share 58% identity at the amino acid sequence level within the ATP-binding domain (or among the representatives of gram-positive and gram-negative bacteria, evolutionarily divergent organisms). Comparison of the predicted amino acid sequences encoded by dnaA of E. coli and B. subtilis revealed two highly conserved regions (FIG. 19). Within each of these regions, a seven amino acid sequence was used to design two oligonucleotide primers for use in the polymerase chain reaction. The DNA oligonucleotides for amplification of T.th. genomic DNA were as follows. The upstream 20 mer (5′-GTSCTSGTSAAGACSCACTT-3′) (SEQ. ID. No. 50) encodes the following sequence: VLVKTHL (SEQ. ID. No. 69). The downstream 21 mer (5′-SAGSAGSGCGTTGAASGTGTG-3′, where S is G or C) (SEQ. ID. No. 51) encodes the sequence: HTFNALL (SEQ. ID. No. 71), on the complementary strand. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 mM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.5 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0464] 1. 5 cycles of: 95.5° C.—30 sec., 45° C.—30 sec., 75° C.—2 min.

[0465] 2. 5 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 75° C.—2 min.

[0466] 3. 30 cycles of: 95.5° C.—30 sec., 52° C.—30 sec., 75° C.—30 min.

[0467] Products were visualized in a 1.5% native agarose gel. A fragment of the expected size of 300 bp was cloned into the SmaI site of pUC19 and sequenced with the CircumVent Thermal Cycle DNA sequencing kit (New England Biolabs).

[0468] To obtain a larger section of the T.th. dnaA gene, genomic DNA was digested with either HaeII, HindIII, KasI, KpnI, MluI, NcoI, NgoMI, NheI, NsiI, PaeR7I, PstI, SacI, SalI, SpeI, SphI, StuI, or XhoI, followed by Southern analysis in a native agarose gel. The filter was probed with the 300 bp PCR product radiolabeled by random priming. Four different restriction digests showed a single fragment of reasonable size for further cloning. These were KasI, NgoMI, and StuI, all of which produced fragments of about 3 kb, and NcoI, which produced a 2 kb fragment. Also, a KpnI digest resulted in two fragments of about 1.5 kb and 10 kb.

[0469] Genomic DNA digests using either NgoMI or StuI were used to obtain the dnaA gene by inverse PCR (also referred to as circular PCR). In this procedure, 0.1 μg of DNA from each digest was treated separately with T4 DNA ligase in 50 μl of ligation buffer (50 mM Tris-HCl (pH 7.8), 10 mM MgCl₂, 10 mM dithiothreitol, 1 mM ATP, 25 mg/ml bovine serum albumin) overnight at 20° C. This results in circularizing the genomic DNA fragments. The ligation mixtures were used as substrate in inverse PCR.

[0470] DNA oligonucleotides for amplification of recircularized T.th. genomic DNA were as follows. The upstream 22 mer was (5′-CTCGTTGGTGAAAGTTTCCGTG-3′) (SEQ. ID. No. 52), and the downstream 24 mer was (5′-CGTCCAGTTCATCGCCGGAAAGGA-3′) (SEQ. ID. No. 53). The amplification reactions contained 5 ng T.th. genomic DNA, 0.5 μM of each primer, in a volume of 100 μl of Taq polymerase reaction mixture containing 10 μl PCR Buffer, 0.5 mM of each dNTP and 2.5 mM MgCl₂. Amplification was performed using the following cycling scheme:

[0471] 1. 5 cycles of: 95.0° C.—30 sec., 55° C.—30 sec., 72° C.—10 min.

[0472] 2. 35 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 72° C.—8 min.

[0473] The PCR fragments of the expected length for NgoMI and StuI treated and then ligated chromosomal DNA were digested with either BamHI or Sau3a and cloned into pUC19:BamHI and pUC19:(BamHI+SmaI) and sequenced with the CircumVent Thermal Cycle DNA sequencing kit. The 1.6 kb (BamHI+BamHI) fragment from the NgoMI PCR product contained a sequence coding for the N-terminal part of dnaN, followed by the gene for enolase. The 1 kb (Sau3a+Sau3a) fragment from the same PCR product included the start of the dnaN gene and sequence characteristic of the origin of replication (i.e., 9 mer DnaA-binding site sequences). The 0.6 kb (BamHI+BamHI) fragment from the StuI PCR reaction contained starts for dnaA and gidA genes in inverse orientation to each other. The 0.4 kb (Sau3a+Sau3a) fragment from the same PCR product contained the 3′ end of the dnaA gene and DNA sequence characteristic for the origin of replication.

[0474] This sequence information provided the beginning and end of both the dnaA and the dnaN genes. Hence, these genes were easily cloned from this information. Further, the dnaN gene was readily cloned and expressed in a pET24-a vector. These steps are described below.

[0475] Cloning and Sequence of the dnaA Gene

[0476] The dnaA gene was cloned for sequencing in two parts: from the potential start of the gene up to its middle and from the middle up to the end. For the N-terminal part, the upstream 27 mer (5′-TCTGGCAACACGTTCTGGAGCACATCC-3′) (SEQ. ID. No. 54) was 20 bp downstream of the potential start codon of the gene. The downstream 23 mer (5′-TGCTGGCGTTCATCTTCAGGATG-3′) (SEQ. ID. No. 55) was approximately from the middle of the dnaA gene. For the C-terminal part, the upstream 23 mer (5′-CATCCTGAAGATGAACGCCAGCA-3′) (SEQ. ID. No. 56) was complementary to the previous primer. The downstream 25 mer (5′-AGGTTATCCACAGGGGTCATGTGCA-3′) (SEQ. ID. No. 57) was 20 bp upstream of the potential stop codon for the dnaA gene. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 μM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.5 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0477] 1. 5 cycles of: 95.5° C.—30 sec., 55° C.—30 sec., 75° C.—3 min.

[0478] 2. 30 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 75° C.—2 min.

[0479] Products were visualized in a 1.0% native agarose gel. Fragments of the expected sizes of 750 bp and 650 bp were produced, and were sequenced using the CircumVent Thermal Cycle DNA sequencing method (New England Biolabs). The nucleotide and amino acid sequences of dnaA and its protein product are shown in FIG. 20. The DnaA protein is homologous to the DnaA proteins of several other bacteria as shown in FIG. 19.
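The two mid-gene dnaA primers (SEQ. ID. Nos. 55 and 56) are described as complementary to each other, which lets the two halves of the gene be joined at a common point. A minimal sketch of that check follows; the function and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


N_TERM_DOWNSTREAM = "TGCTGGCGTTCATCTTCAGGATG"  # SEQ. ID. No. 55
C_TERM_UPSTREAM = "CATCCTGAAGATGAACGCCAGCA"    # SEQ. ID. No. 56

primers_are_complementary = (
    reverse_complement(N_TERM_DOWNSTREAM) == C_TERM_UPSTREAM
)
print(primers_are_complementary)
```

The comparison holds for the sequences as printed, so the N-terminal and C-terminal PCR products overlap by the full 23 nucleotides of these primers.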

[0480] Cloning and Expression of dnaN

[0481] The full length dnaN gene was obtained by PCR from T.th. total DNA. DNA oligonucleotides for amplification of T.th. dnaN were the following: the upstream 29 mer (5′-GTGTGTCATATGAACATAACGGTTCCCAA-3′) (SEQ. ID. No. 58) consists of an NdeI site within the first 11 nucleotides (underlined), followed by the sequence for the start of the dnaN gene; the downstream 29 mer (5′-GCGCGAATTCTCCCTTGTGGAAGGCTTAG-3′) (SEQ. ID. No. 59) consists of an EcoRI site within the first 10 nucleotides (underlined), followed by the sequence complementary to a section just downstream of the dnaN stop codon. The amplification reactions contained 10 ng T.th. genomic DNA, 0.5 μM of each primer, in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 0.5 mM of each dNTP and 0.2 mM MgSO₄. Amplification was performed using the following cycling scheme:

[0482] 1. 5 cycles of: 95.0° C.—30 sec., 55° C.—30 sec., 75° C.—5 min.

[0483] 2. 35 cycles of: 95.5° C.—30 sec., 50° C.—30 sec., 75° C.—4 min.

[0484] The nucleotide and amino acid sequences of dnaN and the β subunit, respectively, are shown in FIG. 21. The T.th. β subunit shows limited homology to the β subunit sequences of several other bacteria over its entire length (FIG. 22).

[0485] The approximately 1 kb dnaN gene was cloned into the pET24-a expression vector using the NdeI and EcoRI restriction sites both in the dnaN containing PCR product and in pET24-a (FIG. 23). Expression of T.th. β subunit was obtained under the following conditions: a fresh colony of BL21(DE3) E. coli strain was transformed by the pET24-a:dnaN plasmid, and then was grown in LB broth containing 50 mg/ml kanamycin at 37° C. until the cell density reached 0.4 OD₆₀₀. The cell culture was then induced for dnaN expression upon addition of 2 mM IPTG. Cells were harvested after 4 additional hours of growth at 37° C. The induction of the T.th. β subunit is shown in FIG. 24.

[0486] Two liters of BL21(DE3) pETdnaN cells were grown in LB media containing 50 mg/ml ampicillin at 37° C. to an O.D. of 0.8 and then IPTG was added to a concentration of 2 mM. After a further 2 h at 37° C., cells were harvested by centrifugation and stored at −70° C. The following steps were performed at 4° C. Cells were thawed and resuspended in 40 ml of 5 mM Tris-HCl (pH 8.0), 1% sucrose, 1 M NaCl, 5 mM DTT, and 30 mM spermidine. Cells were lysed using a French Pressure cell at 20,000 psi. The lysate was allowed to sit at 4° C. for 30 min. and then cell debris was removed by centrifugation (Sorvall SS-34 rotor, 45 min. 18,000 rpm). The supernatant was incubated at 65° C. for 20 minutes with occasional stirring. The resulting protein precipitate was removed by centrifugation as described above. The supernatant was dialyzed against 4 liters of buffer A containing 50 mM NaCl overnight. The dialyzed supernatant was clarified by centrifugation (35 ml, 150 mg total) and then loaded onto an 8 ml MonoQ column equilibrated in buffer A containing 50 mM NaCl. The column was washed with 5 column volumes of the same buffer and then eluted with a 120 ml gradient of buffer A plus 50 mM NaCl to buffer A plus 500 mM NaCl. Fractions of 2 ml were collected. Over 50 mg of T.th. β was recovered in fractions 5-21.
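To relate fraction numbers to salt concentration in the MonoQ elution above, the nominal NaCl concentration at each fraction can be estimated from the gradient volume and fraction size. The sketch below rests on assumptions that the text does not state: the gradient is taken to be linear, fraction 1 is taken to begin where the gradient begins, and column delay volume is ignored. All names are hypothetical.

```python
START_MM, END_MM = 50.0, 500.0  # NaCl in buffer A at gradient start and end
GRADIENT_ML = 120.0             # total gradient volume
FRACTION_ML = 2.0               # volume of each collected fraction


def nacl_at_fraction(n):
    """Nominal NaCl (mM) at the midpoint of 1-based fraction n.

    Assumes a linear gradient with fraction 1 starting at the gradient start.
    """
    midpoint_ml = (n - 0.5) * FRACTION_ML
    return START_MM + (END_MM - START_MM) * midpoint_ml / GRADIENT_ML


n_fractions = int(GRADIENT_ML / FRACTION_ML)
print(n_fractions, nacl_at_fraction(5), nacl_at_fraction(21))
```

Under these assumptions the gradient spans 60 fractions, and fractions 5-21 would correspond to a nominal range of roughly 84 to 204 mM NaCl; the actual elution position depends on the instrument and column plumbing.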

EXAMPLE 13

[0487] Identification and Cloning of T. thermophilus holA

[0488] A search of the incomplete T.th. genome database (www.g21.bio.uni-goettingen.de) showed a match to E. coli δ encoded by holA. The sequence obtained from the database was as follows (SEQ. ID. No. 185): TPKGKDLVRHLENRAKRLGLRLPGGVAQYLA-SLEGDLEALERELEKLALLSP-PLTLEKVEKVVALRPPLTGFDLVRSVLEKDPKEALLRLGRLKEEGEEPLRLLGALSWQFALLARAFFLLREMPRPKEEDLARLEAHPYAAKKALL-EAARRLTEEALKEALDALMEAEKRAKG-GKDPWLALEAAVLRLAR-PAGQ PRVD

[0489] Next, the following PCR primers were designed from the codon usage of T.th.: upstream 27 mer (5′-GCC CAG TAC CTC GCC TCC CTC GAG GGG-3′) (SEQ. ID. No. 186) and downstream 27 mer (5′-GGC CCC CTT GGC CTT CTC GGC CTC CAT-3′) (SEQ. ID. No. 187) to obtain a partial holA nucleotide sequence (SEQ. ID. No. 188):

AGACTCGAGG CCCTGGAGCG GGAGCTGGAG AAGCTTGCCC TCCTCTCCCC ACCCCTCACC 60
CTGGAGAAGG TGGAGAAGGT GGTGGCCCTG AGGCCCCCCC TCACGGGCTT TGACCTGGTG 120
CGCTCCGTCC TGGAGAAGGA CCCCAAGGAG GCCCTCCTGC GCCTCAGGCG CCTCAGGGAG 180
GAGGGGGAGG AGCCCCTCAG GCTCCTCGGG GCCCTCTCCT GGCAGTTCGC CCTCCTCGCC 240
CGGGCCTTCT TCCTCCTCCG GGAAAACCCC AGGCCCAAGG AGGAGGACCT CGCCCGCCTC 300
GAGGCCCACC CCTACGCCGC CAAGAAGGCC A 331

[0490] This sequence codes for a partial amino acid sequence of the T.th. δ subunit (SEQ. ID. No. 189): RLEALERELEKLALLSPPLTLEKVEKVVALRPPLTGFDLVRSVLEKDPKEALLRLRRLREEGEEPLRLLGALSWQFALLARAFFLLRENPRPKEEDLARL EAHPYAAKKA
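The correspondence between the partial holA nucleotide sequence (SEQ. ID. No. 188) and the partial δ subunit sequence (SEQ. ID. No. 189) can be confirmed by translating the DNA in its first reading frame. This is an illustrative sketch assuming the standard genetic code; the sequences are copied from the text with spacing and position numbers removed, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i] for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}

SEQ_188 = (
    "AGACTCGAGGCCCTGGAGCGGGAGCTGGAGAAGCTTGCCCTCCTCTCCCCACCCCTCACC"
    "CTGGAGAAGGTGGAGAAGGTGGTGGCCCTGAGGCCCCCCCTCACGGGCTTTGACCTGGTG"
    "CGCTCCGTCCTGGAGAAGGACCCCAAGGAGGCCCTCCTGCGCCTCAGGCGCCTCAGGGAG"
    "GAGGGGGAGGAGCCCCTCAGGCTCCTCGGGGCCCTCTCCTGGCAGTTCGCCCTCCTCGCC"
    "CGGGCCTTCTTCCTCCTCCGGGAAAACCCCAGGCCCAAGGAGGAGGACCTCGCCCGCCTC"
    "GAGGCCCACCCCTACGCCGCCAAGAAGGCCA"
)
SEQ_189 = (
    "RLEALERELEKLALLSPPLTLEKVEKVVALRPPLTGFDLVRSVLEKDPKEALLRLRRLRE"
    "EGEEPLRLLGALSWQFALLARAFFLLRENPRPKEEDLARLEAHPYAAKKA"
)


def translate(dna):
    """Translate complete codons from position 1; a trailing partial codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


translated = translate(SEQ_188)
print(translated == SEQ_189, len(SEQ_188), len(translated))
```

The 331 nucleotides give 110 complete codons plus one unpaired trailing base, and the translation reproduces SEQ. ID. No. 189 exactly.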

[0491] The DNA sequence obtained by PCR (SEQ. ID. No. 188) was used to design internal primers for inverse PCR. The upstream 31 mer (5′-GTGGTGTCTAGACATCATAACGGTTCTGGCA-3′) (SEQ. ID. No. 190) introduced an XbaI site for cloning holA into a pGEX vector. The downstream 27 mer (5′-GAGGGCCACCACCTTCTCCACCTTCTC-3′) (SEQ. ID. No. 191) encodes holA sequence EKVEKVVAL (aa residues 159-167 of SEQ. ID. No. 158) on the complementary strand. The amplification reactions contained 50 ng T.th. genomic DNA and 0.1 μM of each primer in a volume of 100 μl of Vent polymerase reaction mixture containing 10 μl ThermoPol Buffer, 2.5 mM of each dNTP, 2 mM MgSO₄, and 10 μl of formamide. Amplification was performed using the following cycling scheme:

[0492] 1. 5 cycles of: 95° C.—30 sec., 65° C.—20 sec., 75° C.—5 min.

[0493] 2. 5 cycles of: 95° C.—20 sec., 58° C.—10 sec., 75° C.—5 min.

[0494] 3. 35 cycles of: 95° C.—20 sec., 50° C.—5 sec., 75° C.—4 min.

[0495] Products were visualized in a 1.0% native agarose gel. A fragment of 1.5 kb was gel purified and partially sequenced.

[0496] A different set of primers was used to obtain the 3′-end of T.th. holA, including an upstream 25 mer (5′-CTCCGTCCTGGAGAAGGACCCCAAG-3′) (SEQ. ID. No. 192), which encoded the amino acid sequence SVLEKDPK from T.th. holA (aa residues 179-186 of SEQ. ID. No. 158), and a downstream 29 mer (5′-CGCGAATTCAACGCSCTCCTCAAGACSCT-3′ where S=C or G) (SEQ. ID. No. 193), which was not related to the holA sequence. The amplification reactions contained 50 ng T.th. genomic DNA and 0.1 μM of each primer in a volume of 100 μl of Vent polymerase reaction mixture containing 1.0 μl ThermoPol Buffer, 2.5 mM of each dNTP, 1-2 mM MgSO₄, and 10 μl of formamide. Amplification was performed using the following cycling scheme:

[0497] 1. 5 cycles of: 95° C.—30 sec., 65° C.—20 sec., 75° C.—5 min.

[0498] 2. 5 cycles of: 95° C.—20 sec., 55° C.—10 sec., 75° C.—5 min.

[0499] 3. 35 cycles of: 95° C.—20 sec., 50° C.—5 sec., 75° C.—4 min.

[0500] Products were visualized in a 1.0% native agarose gel. A fragment of 1.2 kb was gel purified and partially sequenced to obtain the remainder of the T.th. holA gene.

[0501] The T.th. holA gene was cloned into the NdeI/EcoRI sites in the pET24 vector using a pair of primers. The upstream 31 mer (5′-GACACTTAACATATGGTCATCGCCTTCACCG-3′) (SEQ. ID. No. 194) contains an NdeI site within the first 15 nucleotides (underlined) and has a sequence corresponding to the 5′ region of T.th. holA. The downstream 38 mer (5′-GTGTGTGAATTCGGGTCAACGGGCGAGGCGGAGGACCG-3′) (SEQ. ID. No. 195) contains an EcoRI site within the first 12 nucleotides (underlined) and has a sequence complementary to the 3′ end of the holA gene.

EXAMPLE 14

[0502] Identification of T.th. holB Encoding δ′ Subunit

[0503] To clone the ends of the T.th. holB gene, it was assumed that the order of genes in Thermus thermophilus could be the same as in the related Deinococcus radiodurans. Multiple alignment of the upstream neighbor (probable phosphoesterase, DNA repair Rad24c related protein) revealed a conserved region close to the C-terminus of the protein sequence:

1 212 1 2007 DNA Thermus thermophilus 1 tccgggggtg gggttcccag gtagaccccggcccctcccg tgagcccctt tacccaggcc 60 gccacctcct ccaggggggc caaggcgtgcaaggagagga acgtccgcac cacgccctat 120 actagccttg tgagcgccct ctaccgccgcttccgccccc tcaccttcca ggaggtggtg 180 gggcaggagc acgtgaagga gcccctcctcaaggccatcc gggaggggag gctcgcccag 240 gcctacctct tctccgggcc caggggcgtgggcaagacca ccacggcgag gctcctcgcc 300 atggcggtgg ggtgccaggg ggaagaccccccttgcgggg tctgccccca ctgccaggcg 360 gtgcagaggg gcgcccaccc ggacgtggtggacattgacg ccgccagcaa caactccgtg 420 gaggacgtgc gggagctgag ggaaaggatccacctcgccc ccctctctgc ccccaggaag 480 gtcttcatcc tggacgaggc ccacatgctctccaaaagcg ccttcaacgc cctcctcaag 540 accctggagg agcccccgcc ccacgtcctcttcgtcttcg ccaccaccga gcccgagagg 600 atgcccccca ccatcctctc ccgcacccagcacttccgct tccgccgcct cacggaggag 660 gagatcgcct ttaagctccg gcgcatcctggaggccgtgg ggcgggaggc ggaggaggag 720 gccctcctcc tcctcgcccg cctggcggacggggccctta gggacgcgga aagcctcctg 780 gagcgcttcc tcctcctgga aggccccctcacccggaagg aggtggagcg cgccctaggc 840 tcccccccag ggaccggggt ggccgagatcgccgcctccc tcgcgagggg gaaaacggcg 900 gaggccctgg gcctcgcccg gcgcctctacggggaagggt acgccccgag gagcctggtc 960 tcgggccttt tggaggtgtt ccgggaaggcctctacgccg ccttcggcct cgcgggaacc 1020 ccccttcccg ccccgcccca ggccctgatcgccgccatga ccgccctgga cgaggccatg 1080 gagcgcctcg cccgccgctc cgacgccttaagcctggagg tggccctcct ggaggcggga 1140 agggccctgg ccgccgaggc cctaccccagcccacgggcg ctccttcccc agaggtcggc 1200 cccaagccgg aaagcccccc gaccccggaacccccaaggc ccgaggaggc gcccgacctg 1260 cgggagcggt ggcgggcctt cctcgaggccctcaggccca ccctacgggc cttcgtgcgg 1320 gaggcccgcc cggaggtccg ggaaggccagctctgcctcg ctttccccga ggacaaggcc 1380 ttccactacc gcaaggcctc ggaacagaaggtgaggctcc tccccctggc ccaggcccat 1440 ttcggggtgg aggaggtcgt cctcgtcctggagggagaaa aaaaaagcct gagcccaagg 1500 ccccgcccgg ccccacctcc tgaagcgcccgcacccccgg gccctcccga ggaggaggta 1560 gaggcggagg aagcggcgga ggaggccccggaggaggcct tgaggcgggt ggtccgcctc 1620 ctgggggggc gggtgctctg ggtgcggcggcccaggaccc gggaggcgcc ggaggaggaa 1680 cccctgagcc 
aagacgagat agggggtactggtatataat gggggcatga cgcggaccac 1740 cgacctcgga caagagaccg tggacaacatcctcaagcgc ctccgccgta ttgagggcca 1800 ggtgcggggg ctccagaaga tggtggccgagggccgcccc tgcgacgagg tcctcaccca 1860 gatgaccgcc accaagaagg ccatggaggcggcggccacc ctgatcctcc acgagttcct 1920 gaacgtctgc gccgccgagg tctccgagggcaaggtgaac cccaagaagc ccgaggagat 1980 cgccaccatg ctgaagaact tcatcta 20072 529 PRT Thermus thermophilus 2 Met Ser Ala Leu Tyr Arg Arg Phe Arg ProLeu Thr Phe Gln Glu Val 1 5 10 15 Val Gly Gln Glu His Val Lys Glu ProLeu Leu Lys Ala Ile Arg Glu 20 25 30 Gly Arg Leu Ala Gln Ala Tyr Leu PheSer Gly Pro Arg Gly Val Gly 35 40 45 Lys Thr Thr Thr Ala Arg Leu Leu AlaMet Ala Val Gly Cys Gln Gly 50 55 60 Glu Asp Pro Pro Cys Gly Val Cys ProHis Cys Gln Ala Val Gln Arg 65 70 75 80 Gly Ala His Pro Asp Val Val AspIle Asp Ala Ala Ser Asn Asn Ser 85 90 95 Val Glu Asp Val Arg Glu Leu ArgGlu Arg Ile His Leu Ala Pro Leu 100 105 110 Ser Ala Pro Arg Lys Val PheIle Leu Asp Glu Ala His Met Leu Ser 115 120 125 Lys Ser Ala Phe Asn AlaLeu Leu Lys Thr Leu Glu Glu Pro Pro Pro 130 135 140 His Val Leu Phe ValPhe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro 145 150 155 160 Thr Ile LeuSer Arg Thr Gln His Phe Arg Phe Arg Arg Leu Thr Glu 165 170 175 Glu GluIle Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala Val Gly Arg 180 185 190 GluAla Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg Leu Ala Asp Gly 195 200 205Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg Phe Leu Leu Leu Glu 210 215220 Gly Pro Leu Thr Arg Lys Glu Val Glu Arg Ala Leu Gly Ser Pro Pro 225230 235 240 Gly Thr Gly Val Ala Glu Ile Ala Ala Ser Leu Ala Arg Gly LysThr 245 250 255 Ala Glu Ala Leu Gly Leu Ala Arg Arg Leu Tyr Gly Glu GlyTyr Ala 260 265 270 Pro Arg Ser Leu Val Ser Gly Leu Leu Glu Val Phe ArgGlu Gly Leu 275 280 285 Tyr Ala Ala Phe Gly Leu Ala Gly Thr Pro Leu ProAla Pro Pro Gln 290 295 300 Ala Leu Ile Ala Ala Met Thr Ala Leu Asp GluAla Met Glu Arg Leu 305 310 315 320 Ala Arg Arg Ser Asp Ala Leu Ser LeuGlu Val Ala Leu Leu Glu Ala 325 330 335 Gly Arg Ala Leu Ala 
Ala Glu AlaLeu Pro Gln Pro Thr Gly Ala Pro 340 345 350 Ser Pro Glu Val Gly Pro LysPro Glu Ser Pro Pro Thr Pro Glu Pro 355 360 365 Pro Arg Pro Glu Glu AlaPro Asp Leu Arg Glu Arg Trp Arg Ala Phe 370 375 380 Leu Glu Ala Leu ArgPro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg 385 390 395 400 Pro Glu ValArg Glu Gly Gln Leu Cys Leu Ala Phe Pro Glu Asp Lys 405 410 415 Ala PheHis Tyr Arg Lys Ala Ser Glu Gln Lys Val Arg Leu Leu Pro 420 425 430 LeuAla Gln Ala His Phe Gly Val Glu Glu Val Val Leu Val Leu Glu 435 440 445Gly Glu Lys Lys Ser Leu Ser Pro Arg Pro Arg Pro Ala Pro Pro Pro 450 455460 Glu Ala Pro Ala Pro Pro Gly Pro Pro Glu Glu Glu Val Glu Ala Glu 465470 475 480 Glu Ala Ala Glu Glu Ala Pro Glu Glu Ala Leu Arg Arg Val ValArg 485 490 495 Leu Leu Gly Gly Arg Val Leu Trp Val Arg Arg Pro Arg ThrArg Glu 500 505 510 Ala Pro Glu Glu Glu Pro Leu Ser Gln Asp Glu Ile GlyGly Thr Gly 515 520 525 Ile 3 1590 DNA Thermus thermophilus 3 gtgagcgccctctaccgccg cttccgcccc ctcaccttcc aggaggtggt ggggcaggag 60 cacgtgaaggagcccctcct caaggccatc cgggagggga ggctcgccca ggcctacctc 120 ttctccgggcccaggggcgt gggcaagacc accacggcga ggctcctcgc catggcggtg 180 gggtgccagggggaagaccc cccttgcggg gtctgccccc actgccaggc ggtgcagagg 240 ggcgcccacccggacgtggt ggacattgac gccgccagca acaactccgt ggaggacgtg 300 cgggagctgagggaaaggat ccacctcgcc cccctctctg cccccaggaa ggtcttcatc 360 ctggacgaggcccacatgct ctccaaaagc gccttcaacg ccctcctcaa gaccctggag 420 gagcccccgccccacgtcct cttcgtcttc gccaccaccg agcccgagag gatgcccccc 480 accatcctctcccgcaccca gcacttccgc ttccgccgcc tcacggagga ggagatcgcc 540 tttaagctccggcgcatcct ggaggccgtg gggcgggagg cggaggagga ggccctcctc 600 ctcctcgcccgcctggcgga cggggccctt agggacgcgg aaagcctcct ggagcgcttc 660 ctcctcctggaaggccccct cacccggaag gaggtggagc gcgccctagg ctccccccca 720 gggaccggggtggccgagat cgccgcctcc ctcgcgaggg ggaaaacggc ggaggccctg 780 ggcctcgcccggcgcctcta cggggaaggg tacgccccga ggagcctggt ctcgggcctt 840 ttggaggtgttccgggaagg cctctacgcc gccttcggcc tcgcgggaac cccccttccc 900 gccccgccccaggccctgat cgccgccatg 
accgccctgg acgaggccat ggagcgcctc 960 gcccgccgctccgacgcctt aagcctggag gtggccctcc tggaggcggg aagggccctg 1020 gccgccgaggccctacccca gcccacgggc gctccttccc cagaggtcgg ccccaagccg 1080 gaaagccccccgaccccgga acccccaagg cccgaggagg cgcccgacct gcgggagcgg 1140 tggcgggccttcctcgaggc cctcaggccc accctacggg ccttcgtgcg ggaggcccgc 1200 ccggaggtccgggaaggcca gctctgcctc gctttccccg aggacaaggc cttccactac 1260 cgcaaggcctcggaacagaa ggtgaggctc ctccccctgg cccaggccca tttcggggtg 1320 gaggaggtcgtcctcgtcct ggagggagaa aaaaaaagcc tgagcccaag gccccgcccg 1380 gccccacctcctgaagcgcc cgcacccccg ggccctcccg aggaggaggt agaggcggag 1440 gaagcggcggaggaggcccc ggaggaggcc ttgaggcggg tggtccgcct cctggggggg 1500 cgggtgctctgggtgcggcg gcccaggacc cgggaggcgc cggaggagga acccctgagc 1560 caagacgagatagggggtac tggtatataa 1590 4 464 PRT Thermus thermophilus 4 Met Ser AlaLeu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val 1 5 10 15 Val GlyGln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu 20 25 30 Gly ArgLeu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly 35 40 45 Lys ThrThr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly 50 55 60 Glu AspPro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg 65 70 75 80 GlyAla His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser 85 90 95 ValGlu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu 100 105 110Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser 115 120125 Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro 130135 140 His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro145 150 155 160 Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg LeuThr Glu 165 170 175 Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu AlaVal Gly Arg 180 185 190 Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala ArgLeu Ala Asp Gly 195 200 205 Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu ArgPhe Leu Leu Leu Glu 210 215 220 Gly Pro Leu Thr Arg Lys Glu Val Glu ArgAla Leu Gly Ser Pro Pro 225 230 235 240 Gly Thr Gly Val Ala Glu Ile AlaAla Ser Leu Ala Arg Gly Lys Thr 245 250 255 
Ala Glu Ala Leu Gly Leu AlaArg Arg Leu Tyr Gly Glu Gly Tyr Ala 260 265 270 Pro Arg Ser Leu Val SerGly Leu Leu Glu Val Phe Arg Glu Gly Leu 275 280 285 Tyr Ala Ala Phe GlyLeu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln 290 295 300 Ala Leu Ile AlaAla Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu 305 310 315 320 Ala ArgArg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala 325 330 335 GlyArg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro 340 345 350Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro 355 360365 Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe 370375 380 Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu Ala Arg385 390 395 400 Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe Pro GluAsp Lys 405 410 415 Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys Val ArgLeu Leu Pro 420 425 430 Leu Ala Gln Ala His Phe Gly Val Glu Glu Val ValLeu Val Leu Glu 435 440 445 Gly Glu Lys Lys Lys Pro Glu Pro Lys Ala ProPro Gly Pro Thr Ser 450 455 460 5 454 PRT Thermus thermophilus 5 Met SerAla Leu Tyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val 1 5 10 15 ValGly Gln Glu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu 20 25 30 GlyArg Leu Ala Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly 35 40 45 LysThr Thr Thr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly 50 55 60 GluAsp Pro Pro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg 65 70 75 80Gly Ala His Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser 85 90 95Val Glu Asp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu 100 105110 Ser Ala Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser 115120 125 Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro130 135 140 His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met ProPro 145 150 155 160 Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg ArgLeu Thr Glu 165 170 175 Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu GluAla Val Gly Arg 180 185 190 Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu AlaArg Leu Ala Asp Gly 195 200 205 Ala Leu Arg Asp Ala 
Glu Ser Leu Leu GluArg Phe Leu Leu Leu Glu 210 215 220 Gly Pro Leu Thr Arg Lys Glu Val GluArg Ala Leu Gly Ser Pro Pro 225 230 235 240 Gly Thr Gly Val Ala Glu IleAla Ala Ser Leu Ala Arg Gly Lys Thr 245 250 255 Ala Glu Ala Leu Gly LeuAla Arg Arg Leu Tyr Gly Glu Gly Tyr Ala 260 265 270 Pro Arg Ser Leu ValSer Gly Leu Leu Glu Val Phe Arg Glu Gly Leu 275 280 285 Tyr Ala Ala PheGly Leu Ala Gly Thr Pro Leu Pro Ala Pro Pro Gln 290 295 300 Ala Leu IleAla Ala Met Thr Ala Leu Asp Glu Ala Met Glu Arg Leu 305 310 315 320 AlaArg Arg Ser Asp Ala Leu Ser Leu Glu Val Ala Leu Leu Glu Ala 325 330 335Gly Arg Ala Leu Ala Ala Glu Ala Leu Pro Gln Pro Thr Gly Ala Pro 340 345350 Ser Pro Glu Val Gly Pro Lys Pro Glu Ser Pro Pro Thr Pro Glu Pro 355360 365 Pro Arg Pro Glu Glu Ala Pro Asp Leu Arg Glu Arg Trp Arg Ala Phe370 375 380 Leu Glu Ala Leu Arg Pro Thr Leu Arg Ala Phe Val Arg Glu AlaArg 385 390 395 400 Pro Glu Val Arg Glu Gly Gln Leu Cys Leu Ala Phe ProGlu Asp Lys 405 410 415 Ala Phe His Tyr Arg Lys Ala Ser Glu Gln Lys ValArg Leu Leu Pro 420 425 430 Leu Ala Gln Ala His Phe Gly Val Glu Glu ValVal Leu Val Leu Glu 435 440 445 Gly Glu Lys Lys Lys Ala 450 6 32 DNAArtificial Sequence Description of Artificial Sequence primer 6cgcaagcttc acgcstacct sttctccggs ac 32 7 8 PRT Artificial SequenceDescription of Artificial Sequence peptide 7 His Ala Tyr Leu Phe Ser GlyThr 1 5 8 34 DNA Artificial Sequence Description of Artificial Sequenceprimer 8 cgcgaattcg tgctcsggsg gctcctcsag sgtc 34 9 9 PRT ArtificialSequence Description of Artificial Sequence peptide 9 Lys Thr Leu GluGlu Pro Pro Glu His 1 5 10 38 DNA Artificial Sequence Description ofArtificial Sequence primer 10 gcgcggatcc ggagggagaa aaaaaaagcc tcagccca38 11 38 DNA Artificial Sequence Description of Artificial Sequenceprimer 11 gcgcggatcc ggagggagag aagaaaagcc tcagccca 38 12 28 DNAArtificial Sequence Description of Artificial Sequence primer 12gaattaaatt cgcgcttcgg gaggtggg 28 13 27 DNA Artificial SequenceDescription of Artificial Sequence primer 
13 gcgcgaattc gcgcttcgggaggtggg 27 14 29 DNA Artificial Sequence Description of ArtificialSequence primer 14 gcgcgaattc gggcgcttca ggaggtggg 29 15 31 DNAArtificial Sequence Description of Artificial Sequence primer 15gtggtgcata tggtgagcgc cctctaccgc c 31 16 31 DNA Artificial SequenceDescription of Artificial Sequence primer 16 gtggtggtcg acccaggagggccacctcca g 31 17 8 PRT Artificial Sequence Description of ArtificialSequence peptide 17 Gly Xaa Xaa Gly Xaa Gly Lys Thr 1 5 18 12 PRTArtificial Sequence Description of Artificial Sequence peptide 18 LysPro Asp Pro Lys Ala Pro Pro Gly Pro Thr Ser 1 5 10 19 180 PRTEscherichia coli 19 Met Ser Tyr Gln Val Leu Ala Arg Lys Trp Arg Pro GlnThr Phe Ala 1 5 10 15 Asp Val Val Gly Gln Glu His Val Leu Thr Ala LeuAla Asn Gly Leu 20 25 30 Ser Leu Gly Arg Ile His His Ala Tyr Leu Phe SerGly Thr Arg Gly 35 40 45 Val Gly Lys Thr Ser Ile Ala Arg Leu Leu Ala LysGly Leu Asn Cys 50 55 60 Glu Thr Gly Ile Thr Ala Thr Pro Cys Gly Val CysAsp Asn Cys Arg 65 70 75 80 Glu Ile Glu Gln Gly Arg Phe Val Asp Leu IleGlu Ile Asp Ala Ala 85 90 95 Ser Arg Thr Lys Val Glu Asp Thr Arg Asp LeuLeu Asp Asn Val Gln 100 105 110 Tyr Ala Pro Ala Arg Gly Arg Phe Lys ValTyr Leu Ile Asp Glu Val 115 120 125 His Met Leu Ser Arg His Ser Phe AsnAla Leu Leu Lys Thr Leu Glu 130 135 140 Glu Pro Pro Glu His Val Lys PheLeu Leu Ala Thr Thr Asp Pro Gln 145 150 155 160 Lys Leu Pro Val Thr IleLeu Ser Arg Cys Leu Gln Phe His Leu Lys 165 170 175 Ala Leu Asp Val 18020 180 PRT Bacillus subtilis 20 Met Ser Tyr Gln Ala Leu Tyr Arg Val PheArg Pro Gln Arg Phe Glu 1 5 10 15 Asp Val Val Gly Gln Glu His Ile ThrLys Thr Leu Gln Asn Ala Leu 20 25 30 Leu Gln Lys Lys Phe Ser His Ala TyrLeu Phe Ser Gly Pro Arg Gly 35 40 45 Thr Gly Lys Thr Ser Ala Ala Lys IlePhe Ala Lys Ala Val Asn Cys 50 55 60 Glu His Ala Pro Val Asp Glu Pro CysAsn Glu Cys Ala Ala Cys Lys 65 70 75 80 Gly Ile Thr Asn Gly Ser Ile SerAsp Val Ile Glu Ile Asp Ala Ala 85 90 95 Ser Asn Asn Gly Val Asp Glu IleArg Asp Ile Arg Asp Lys Val Lys 
100 105 110 Phe Ala Pro Ser Ala Val ThrTyr Lys Val Tyr Ile Ile Asp Glu Val 115 120 125 His Met Leu Ser Ile GlyAla Phe Asn Ala Leu Leu Lys Thr Leu Glu 130 135 140 Glu Pro Pro Glu HisCys Ile Phe Ile Leu Ala Thr Thr Glu Pro His 145 150 155 160 Lys Ile ProLeu Thr Ile Ile Ser Arg Cys Gln Arg Phe Asp Phe Lys 165 170 175 Arg IleThr Ser 180 21 294 PRT Escherichia coli 21 Met Ser Tyr Gln Val Leu AlaArg Lys Trp Arg Pro Gln Thr Phe Ala 1 5 10 15 Asp Val Val Gly Gln GluHis Val Leu Thr Ala Leu Ala Asn Gly Leu 20 25 30 Ser Leu Gly Arg Ile HisHis Ala Tyr Leu Phe Ser Gly Thr Arg Gly 35 40 45 Val Gly Lys Thr Ser IleAla Arg Leu Leu Ala Lys Gly Leu Asn Cys 50 55 60 Glu Thr Gly Ile Thr AlaThr Pro Cys Gly Val Cys Asp Asn Cys Arg 65 70 75 80 Glu Ile Glu Gln GlyArg Phe Val Asp Leu Ile Glu Ile Asp Ala Ala 85 90 95 Ser Arg Thr Lys ValGlu Asp Thr Arg Asp Leu Leu Asp Asn Val Gln 100 105 110 Tyr Ala Pro AlaArg Gly Arg Phe Lys Val Tyr Leu Ile Asp Glu Val 115 120 125 His Met LeuSer Arg His Ser Phe Asn Ala Leu Leu Lys Thr Leu Glu 130 135 140 Glu ProPro Glu His Val Lys Phe Leu Leu Ala Thr Thr Asp Pro Gln 145 150 155 160Lys Leu Pro Val Thr Ile Leu Ser Arg Cys Leu Gln Phe His Leu Lys 165 170175 Ala Leu Asp Val Glu Gln Ile Arg His Gln Leu Glu His Ile Leu Asn 180185 190 Glu Glu His Ile Ala His Glu Pro Arg Ala Leu Gln Leu Leu Ala Arg195 200 205 Ala Ala Glu Gly Ser Leu Arg Asp Ala Leu Ser Leu Thr Asp GlnAla 210 215 220 Ile Ala Ser Gly Asp Gly Gln Val Ser Thr Gln Ala Val SerAla Met 225 230 235 240 Leu Gly Thr Leu Asp Asp Asp Gln Ala Leu Ser LeuVal Glu Ala Met 245 250 255 Val Glu Ala Asn Gly Glu Arg Val Met Ala LeuIle Asn Glu Ala Ala 260 265 270 Ala Arg Gly Ile Glu Trp Glu Ala Leu LeuVal Glu Met Leu Gly Leu 275 280 285 Leu His Arg Ile Ala Met 290 22 294PRT Haemophilus influenzae 22 Met Ser Tyr Gln Val Leu Ala Arg Lys TrpArg Pro Lys Thr Phe Ala 1 5 10 15 Asp Val Val Gly Gln Glu His Ile IleThr Ala Leu Ala Asn Gly Leu 20 25 30 Lys Asp Asn Arg Leu His His Ala TyrLeu Phe Ser Gly Thr Arg Gly 35 40 45 Val Gly 
Lys Thr Ser Ile Ala Arg LeuPhe Ala Lys Gly Leu Asn Cys 50 55 60 Val His Gly Val Thr Ala Thr Pro CysGly Glu Cys Glu Asn Cys Lys 65 70 75 80 Ala Ile Glu Gln Gly Asn Phe IleAsp Leu Ile Glu Ile Asp Ala Ala 85 90 95 Ser Arg Thr Lys Val Glu Asp ThrArg Glu Leu Leu Asp Asn Val Gln 100 105 110 Tyr Lys Pro Val Val Gly ArgPhe Lys Val Tyr Leu Ile Asp Glu Val 115 120 125 His Met Leu Ser Arg HisSer Phe Asn Ala Leu Leu Lys Thr Leu Glu 130 135 140 Glu Pro Pro Glu TyrVal Lys Phe Leu Leu Ala Thr Thr Asp Pro Gln 145 150 155 160 Lys Leu ProVal Thr Ile Leu Ser Arg Cys Leu Gln Phe His Leu Lys 165 170 175 Ala LeuAsp Glu Thr Gln Ile Ser Gln His Leu Ala His Ile Leu Thr 180 185 190 GlnGlu Asn Ile Pro Phe Glu Asp Pro Ala Leu Val Lys Leu Ala Lys 195 200 205Ala Ala Gln Gly Ser Ile Arg Asp Ser Leu Ser Leu Thr Asp Gln Ala 210 215220 Ile Ala Met Gly Asp Arg Gln Val Thr Asn Asn Val Val Ser Asn Met 225230 235 240 Leu Gly Leu Leu Asp Asp Asn Tyr Ser Val Asp Ile Leu Tyr AlaLeu 245 250 255 His Gln Gly Asn Gly Glu Leu Leu Met Arg Thr Leu Gln ArgVal Ala 260 265 270 Asp Ala Ala Gly Asp Trp Asp Lys Leu Leu Gly Glu CysAla Glu Lys 275 280 285 Leu His Gln Ile Ala Leu 290 23 294 PRT Bacillussubtilis 23 Met Ser Tyr Gln Ala Leu Tyr Arg Val Phe Arg Pro Gln Arg PheGlu 1 5 10 15 Asp Val Val Gly Gln Glu His Ile Thr Lys Thr Leu Gln AsnAla Leu 20 25 30 Leu Gln Lys Lys Phe Ser His Ala Tyr Leu Phe Ser Gly ProArg Gly 35 40 45 Thr Gly Lys Thr Ser Ala Ala Lys Ile Phe Ala Lys Ala ValAsn Cys 50 55 60 Glu His Ala Pro Val Asp Glu Pro Cys Asn Glu Cys Ala AlaCys Lys 65 70 75 80 Gly Ile Thr Asn Gly Ser Ile Ser Asp Val Ile Glu IleAsp Ala Ala 85 90 95 Ser Asn Asn Gly Val Asp Glu Ile Arg Asp Ile Arg AspLys Val Lys 100 105 110 Phe Ala Pro Ser Ala Val Thr Tyr Lys Val Tyr IleIle Asp Glu Val 115 120 125 His Met Leu Ser Ile Gly Ala Phe Asn Ala LeuLeu Lys Thr Leu Glu 130 135 140 Glu Pro Pro Glu His Cys Ile Phe Ile LeuAla Thr Thr Glu Pro His 145 150 155 160 Lys Ile Pro Leu Thr Ile Ile SerArg Cys Gln Arg Phe Asp Phe Lys 165 170 175 Arg Ile 
Thr Ser Gln Ala IleVal Gly Arg Met Asn Lys Ile Val Asp 180 185 190 Ala Glu Gln Leu Gln ValGlu Glu Gly Ser Leu Glu Ile Ile Ala Ser 195 200 205 Ala Ala His Gly GlyMet Arg Asp Ala Leu Ser Leu Leu Asp Gln Ala 210 215 220 Ile Ser Phe SerGly Asp Ile Leu Lys Val Glu Asp Ala Leu Leu Ile 225 230 235 240 Thr GlyAla Val Ser Gln Leu Tyr Ile Gly Lys Leu Ala Lys Ser Leu 245 250 255 HisAsp Lys Asn Val Ser Asp Ala Leu Glu Thr Leu Asn Glu Leu Leu 260 265 270Gln Gln Gly Lys Asp Pro Ala Lys Leu Ile Glu Asp Met Ile Phe Tyr 275 280285 Phe Arg Asp Met Leu Leu 290 24 300 PRT Caulobacter crescentus 24 AspAla Tyr Thr Val Leu Ala Arg Lys Tyr Arg Pro Arg Thr Phe Glu 1 5 10 15Asp Leu Ile Gly Gln Glu Ala Met Val Arg Thr Leu Ala Asn Ala Phe 20 25 30Ser Thr Gly Arg Ile Ala His Ala Phe Met Leu Thr Gly Val Arg Gly 35 40 45Val Gly Lys Thr Thr Thr Ala Arg Leu Leu Ala Arg Ala Leu Asn Tyr 50 55 60Glu Thr Asp Thr Val Lys Gly Pro Ser Val Asp Leu Thr Thr Glu Gly 65 70 7580 Tyr His Cys Arg Ser Ile Ile Glu Gly Arg His Met Asp Val Leu Glu 85 9095 Leu Asp Ala Ala Ser Arg Thr Lys Val Asp Glu Met Arg Glu Leu Leu 100105 110 Asp Gly Val Arg Tyr Ala Pro Val Glu Ala Arg Tyr Lys Val Tyr Ile115 120 125 Ile Asp Glu Val His Met Leu Ser Thr Ala Ala Phe Asn Ala LeuLeu 130 135 140 Lys Thr Leu Glu Glu Pro Pro Pro His Ala Lys Phe Ile PheAla Thr 145 150 155 160 Thr Glu Ile Arg Lys Val Pro Val Thr Ile Leu SerArg Cys Gln Arg 165 170 175 Phe Asp Leu Arg Arg Val Glu Pro Asp Val LeuVal Lys His Phe Asp 180 185 190 Arg Ile Ser Ala Lys Glu Gly Ala Arg IleGlu Met Asp Ala Leu Ala 195 200 205 Leu Ile Ala Arg Ala Ala Glu Gly SerVal Arg Asp Gly Leu Ser Leu 210 215 220 Leu Asp Gln Ala Ile Val Gln ThrGlu Arg Gly Gln Thr Val Thr Ser 225 230 235 240 Thr Val Val Arg Asp MetLeu Gly Leu Ala Asp Arg Ser Gln Thr Ile 245 250 255 Ala Leu Tyr Glu HisVal Met Ala Gly Lys Thr Lys Asp Ala Leu Glu 260 265 270 Gly Phe Arg AlaLeu Trp Gly Phe Gly Ala Asp Pro Ala Val Val Met 275 280 285 Leu Asp ValLeu Asp His Cys His Ala Ser Ala Val 290 295 300 25 260 PRT 
Mycoplasmagenitalium 25 Met His Gln Val Phe Tyr Gln Lys Tyr Arg Pro Ile Asn PheLys Gln 1 5 10 15 Thr Leu Gly Gln Glu Ser Ile Arg Lys Ile Leu Val AsnAla Ile Asn 20 25 30 Arg Asp Lys Leu Pro Asn Gly Tyr Ile Phe Ser Gly GluArg Gly Thr 35 40 45 Gly Lys Thr Thr Phe Ala Lys Ile Ile Ala Lys Ala IleAsn Cys Leu 50 55 60 Asn Trp Asp Gln Ile Asp Val Cys Asn Ser Cys Asp ValCys Lys Ser 65 70 75 80 Ile Asn Thr Asn Ser Ala Ile Asp Ile Val Glu IleAsp Ala Ala Ser 85 90 95 Lys Asn Gly Ile Asn Asp Ile Arg Glu Leu Val GluAsn Val Phe Asn 100 105 110 His Pro Phe Thr Phe Lys Lys Lys Val Tyr IleLeu Asp Glu Ala His 115 120 125 Met Leu Thr Thr Gln Ser Trp Gly Gly LeuLeu Lys Thr Leu Glu Glu 130 135 140 Ser Pro Pro Tyr Val Leu Phe Ile PheThr Thr Thr Glu Phe Asn Lys 145 150 155 160 Ile Pro Leu Thr Ile Leu SerArg Cys Gln Ser Phe Phe Phe Lys Lys 165 170 175 Ile Thr Ser Asp Leu IleLeu Glu Arg Leu Asn Asp Ile Ala Lys Lys 180 185 190 Glu Lys Ile Lys IleGlu Lys Asp Ala Leu Ile Lys Ile Ala Asp Leu 195 200 205 Ser Gln Gly SerLeu Arg Asp Gly Leu Ser Leu Leu Asp Gln Leu Ala 210 215 220 Ile Ser LeuIle Val Lys Lys Leu Val Leu Leu Met Leu Lys Lys His 225 230 235 240 LeuIle Ser Leu Ile Glu Met Gln Asn Leu Leu Leu Leu Lys Gln Phe 245 250 255Tyr Gln Glu Ile 260 26 289 PRT Thermus thermophilus 26 Val Ser Ala LeuTyr Arg Arg Phe Arg Pro Leu Thr Phe Gln Glu Val 1 5 10 15 Val Gly GlnGlu His Val Lys Glu Pro Leu Leu Lys Ala Ile Arg Glu 20 25 30 Gly Arg LeuAla Gln Ala Tyr Leu Phe Ser Gly Pro Arg Gly Val Gly 35 40 45 Lys Thr ThrThr Ala Arg Leu Leu Ala Met Ala Val Gly Cys Gln Gly 50 55 60 Glu Asp ProPro Cys Gly Val Cys Pro His Cys Gln Ala Val Gln Arg 65 70 75 80 Gly AlaHis Pro Asp Val Val Asp Ile Asp Ala Ala Ser Asn Asn Ser 85 90 95 Val GluAsp Val Arg Glu Leu Arg Glu Arg Ile His Leu Ala Pro Leu 100 105 110 SerAla Pro Arg Lys Val Phe Ile Leu Asp Glu Ala His Met Leu Ser 115 120 125Lys Ser Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu Glu Pro Pro Pro 130 135140 His Val Leu Phe Val Phe Ala Thr Thr Glu Pro Glu Arg Met Pro Pro 
145150 155 160 Thr Ile Leu Ser Arg Thr Gln His Phe Arg Phe Arg Arg Leu ThrGlu 165 170 175 Glu Glu Ile Ala Phe Lys Leu Arg Arg Ile Leu Glu Ala ValGly Arg 180 185 190 Glu Ala Glu Glu Glu Ala Leu Leu Leu Leu Ala Arg LeuAla Asp Gly 195 200 205 Ala Leu Arg Asp Ala Glu Ser Leu Leu Glu Arg PheLeu Leu Leu Glu 210 215 220 Gly Pro Leu Thr Arg Lys Glu Val Glu Arg AlaLeu Gly Ser Pro Pro 225 230 235 240 Gly Thr Gly Val Ala Glu Ile Ala AlaSer Leu Ala Arg Gly Lys Thr 245 250 255 Ala Glu Ala Leu Gly Leu Ala ArgArg Leu Tyr Gly Glu Gly Tyr Ala 260 265 270 Pro Arg Ser Leu Val Ser GlyLeu Leu Glu Val Phe Arg Glu Gly Leu 275 280 285 Tyr 27 94 DNA Thermusthermophilus 27 gccggaggga gaaaaaaaaa gccgagccca aggccccgcc cggccccaccccgaagcgcc 60 cgcacccccg ggccccccga ggaggaggag aggc 94 28 11 PRT Thermusthermophilus 28 Val Leu Glu Gly Glu Lys Lys Ser Leu Ser Pro 1 5 10 29 23DNA Artificial Sequence Description of Artificial Sequence primer 29cacgcntacc tnttctccgg nac 23 30 25 DNA Artificial Sequence Descriptionof Artificial Sequence primer 30 gtgctcnggn ggctcctcnt cngtc 25 31 33DNA Artificial Sequence Description of Artificial Sequence primer 31gtgggatccg tggttctgga tctcgatgaa gaa 33 32 29 DNA Artificial SequenceDescription of Artificial Sequence primer 32 gtgggatcca cggsctstcsgagcagaag 29 33 34 DNA Artificial Sequence Description of ArtificialSequence primer 33 gcgggatcct caacgaggac ctctccatct tcaa 34 34 35 DNAArtificial Sequence Description of Artificial Sequence primer 34gcgggatcct tgtcgtcsag sgtsagsgcg tcgta 35 35 39 DNA Artificial SequenceDescription of Artificial Sequence primer 35 gggaaggacc agcgcgtactccccctgctc ctaggtgtg 39 36 27 DNA Artificial Sequence Description ofArtificial Sequence primer 36 gtgtggatcc ttcttcttsc ccatsgc 27 37 27 DNAArtificial Sequence Description of Artificial Sequence primer 37caccgattcc agtggtgcct aggtgtg 27 38 30 DNA Artificial SequenceDescription of Artificial Sequence primer 38 caacacctgg tgttccaggagcctgtgctt 30 39 23 DNA Artificial Sequence Description of 
Artificial Sequence primer 39 ccagaatcgt ctgctggtcg tag 23
40 19 DNA Artificial Sequence Description of Artificial Sequence primer 40 agcaccctgg aggagcttc 19
41 19 DNA Artificial Sequence Description of Artificial Sequence primer 41 catgtcgtac tgggtgtac 19
42 27 DNA Artificial Sequence Description of Artificial Sequence primer 42 gtsgtsnnsg acnnsgagac sacsggg 27
43 27 DNA Artificial Sequence Description of Artificial Sequence primer 43 gaasccsnng tcgaasnngg cgttgtg 27
44 27 DNA Artificial Sequence Description of Artificial Sequence primer 44 cggggatcca cctcaatcac ctcgtgg 27
45 30 DNA Artificial Sequence Description of Artificial Sequence primer 45 cggggatccg ccaccttgcg gctccgggtg 30
46 31 DNA Artificial Sequence Description of Artificial Sequence primer 46 gcgctctaga cgagttccca aagcgtgcgg t 31
47 25 DNA Artificial Sequence Description of Artificial Sequence primer 47 cgcgtctaga tcacctgtat ccaga 25
48 33 DNA Artificial Sequence Description of Artificial Sequence primer 48 gcggcgcata tggtggtggt cctggacctg gag 33
49 25 DNA Artificial Sequence Description of Artificial Sequence primer 49 cgcgtctaga tcacctgtat ccaga 25
50 20 DNA Artificial Sequence Description of Artificial Sequence primer 50 gtsctsgtsa agacscactt 20
51 21 DNA Artificial Sequence Description of Artificial Sequence primer 51 sagsagsgcg ttgaasgtgt g 21
52 22 DNA Artificial Sequence Description of Artificial Sequence primer 52 ctcgttggtg aaagtttccg tg 22
53 22 DNA Artificial Sequence Description of Artificial Sequence primer 53 ctcgttggtg aaagtttccg tg 22
54 27 DNA Artificial Sequence Description of Artificial Sequence primer 54 tctggcaaca cgttctggag cacatcc 27
55 23 DNA Artificial Sequence Description of Artificial Sequence primer 55 tgctggcgtt catcttcagg atg 23
56 23 DNA Artificial Sequence Description of Artificial Sequence primer 56 catcctgaag atgaacgcca gca 23
57 25 DNA Artificial Sequence Description of Artificial Sequence primer 57 aggttatcca caggggtcat gtgca 25
58 29 DNA Artificial Sequence Description of Artificial Sequence
primer 58 gtgtgtcata tgaacataac ggttcccaa 29
59 29 DNA Artificial Sequence Description of Artificial Sequence primer 59 gcgcgaattc tcccttgtgg aaggcttag 29
60 13 PRT Thermus thermophilus 60 Arg Val Glu Leu Asp Tyr Asp Ala Leu Thr Leu Asp Asp 1 5 10
61 14 PRT Thermus thermophilus 61 Phe Phe Ile Glu Ile Gln Asn His Gly Leu Ser Glu Gln Lys 1 5 10
62 8 PRT Thermus thermophilus 62 Phe Phe Ile Glu Ile Gln Asn His 1 5
63 8 PRT Thermus thermophilus 63 Tyr Asp Ala Leu Thr Leu Asp Asp 1 5
64 6 PRT Thermus thermophilus 64 Ala Met Gly Lys Lys Lys 1 5
65 9 PRT Thermus thermophilus 65 Phe Asn Lys Ser His Ser Ala Ala Tyr 1 5
66 9 PRT Artificial Sequence Description of Artificial Sequence peptide 66 Val Val Xaa Asp Xaa Glu Thr Thr Gly 1 5
67 9 PRT Artificial Sequence Description of Artificial Sequence peptide 67 His Asn Ala Xaa Phe Asp Xaa Gly Phe 1 5
68 9 PRT Artificial Sequence Description of Artificial Sequence peptide 68 Val Val Xaa Asp Xaa Glu Thr Thr Gly 1 5
69 7 PRT Thermus thermophilus 69 Val Leu Val Lys Thr His Leu 1 5
70 6 PRT Artificial Sequence Description of Artificial Sequence peptide 70 His Arg Ala Leu Tyr Asp 1 5
71 7 PRT Thermus thermophilus 71 His Thr Phe Asn Ala Leu Leu 1 5
72 34 PRT Escherichia coli 72 Asp Arg Tyr Phe Leu Glu Leu Ile Arg Thr Gly Arg Pro Asp Glu Glu 1 5 10 15 Ser Tyr Leu His Ala Ala Val Glu Leu Ala Glu Ala Arg Gly Leu Pro 20 25 30 Val Val
73 34 PRT Vibrio cholerae 73 Asp His Phe Tyr Leu Glu Leu Ile Arg Thr Gly Arg Ala Asp Glu Glu 1 5 10 15 Ser Tyr Leu His Phe Ala Leu Asp Val Ala Glu Gln Tyr Asp Leu Pro 20 25 30 Val Val
74 34 PRT Haemophilus influenzae 74 Asp His Phe Tyr Leu Ala Leu Ser Arg Thr Gly Arg Pro Asn Glu Glu 1 5 10 15 Arg Tyr Ile Gln Ala Ala Leu Lys Leu Ala Glu Arg Cys Asp Leu Pro 20 25 30 Leu Val
75 34 PRT Rickettsia prowazekii 75 Asp Arg Phe Tyr Phe Glu Ile Met Arg His Asp Leu Pro Glu Glu Gln 1 5 10 15 Phe Ile Glu Asn Ser Tyr Ile Gln Ile Ala Ser Glu Leu Ser Ile Pro 20 25 30 Ile Val
76 34 PRT Helicobacter pylori 76 Asp Asp Phe Tyr Leu Glu Ile Met Arg His Gly
Ile Leu Asp Gln Arg 1 5 10 15 Phe Ile Asp Glu Gln Val Ile Lys Met Ser Leu Glu Thr Gly Leu Lys 20 25 30 Ile Ile
77 34 PRT Synechocystis sp. 77 Asp Asp Tyr Tyr Leu Glu Ile Gln Asp His Gly Ser Val Glu Asp Arg 1 5 10 15 Leu Val Asn Ile Asn Leu Val Lys Ile Ala Gln Glu Leu Asp Ile Lys 20 25 30 Ile Val
78 34 PRT Mycobacterium tuberculosis 78 Asp Asn Tyr Phe Leu Glu Leu Met Asp His Gly Leu Thr Ile Glu Arg 1 5 10 15 Arg Val Arg Asp Gly Leu Leu Glu Ile Gly Arg Ala Leu Asn Ile Pro 20 25 30 Pro Leu
79 46 PRT Escherichia coli 79 Asn Lys Arg Arg Ala Lys Asn Gly Glu Pro Pro Leu Asp Ile Ala Ala 1 5 10 15 Ile Pro Leu Asp Asp Lys Lys Ser Phe Asp Met Leu Gln Arg Ser Glu 20 25 30 Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp 35 40 45
80 46 PRT Vibrio cholerae 80 Asn Pro Arg Leu Lys Lys Ala Gly Lys Pro Pro Val Arg Ile Glu Ala 1 5 10 15 Ile Pro Leu Asp Asp Ala Arg Ser Phe Arg Asn Leu Gln Asp Ala Lys 20 25 30 Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Glu 35 40 45
81 46 PRT Haemophilus influenzae 81 Asn Val Arg Met Val Arg Glu Gly Lys Pro Arg Val Asp Ile Ala Ala 1 5 10 15 Ile Pro Leu Asp Asp Pro Glu Ser Phe Glu Leu Leu Lys Arg Ser Glu 20 25 30 Thr Thr Ala Val Phe Gln Leu Glu Ser Arg Gly Met Lys Asp 35 40 45
82 46 PRT Rickettsia prowazekii 82 Cys Lys Lys Leu Leu Lys Glu Gln Gly Ile Lys Ile Asp Phe Asp Asp 1 5 10 15 Met Thr Phe Asp Asp Lys Lys Thr Tyr Gln Met Leu Cys Lys Gly Lys 20 25 30 Gly Val Gly Val Phe Gln Phe Glu Ser Ile Gly Met Lys Asp 35 40 45
83 45 PRT Helicobacter pylori 83 Leu Lys Ile Ile Lys Thr Gln His Lys Ile Ser Val Asp Phe Leu Ser 1 5 10 15 Leu Asp Met Asp Asp Pro Lys Val Tyr Lys Thr Ile Gln Ser Gly Asp 20 25 30 Thr Val Gly Ile Phe Gln Ile Glu Ser Gly Met Phe Gln 35 40 45
84 46 PRT Synechocystis sp.
84 Gln Glu Arg Lys Ala Leu Gln Ile ArgAla Arg Thr Gly Ser Lys Lys 1 5 10 15 Leu Pro Asp Asp Val Lys Lys ThrHis Lys Leu Leu Glu Ala Gly Asp 20 25 30 Leu Glu Gly Ile Phe Gln Leu GluSer Gln Gly Met Lys Gln 35 40 45 85 46 PRT Mycobacterium tuberculosis 85Ile Asp Asn Val Arg Ala Asn Arg Gly Ile Asp Leu Asp Leu Glu Ser 1 5 1015 Val Pro Leu Asp Asp Lys Ala Thr Tyr Glu Leu Leu Gly Arg Gly Asp 20 2530 Thr Leu Gly Val Phe Gln Leu Asp Gly Gly Pro Met Arg Asp 35 40 45 863729 DNA Thermus thermophilus 86 atgggccggg agctccgctt cgcccacctccaccagcaca cccagttctc cctcctggac 60 ggggcggcga agctttccga cctcctcaagtgggtcaagg agacgacccc cgaggacccc 120 gccttggcca tgaccgacca cggcaacctcttcggggccg tggagttcta caagaaggcc 180 accgaaatgg gcatcaagcc catcctgggctacgaggcct acgtggcggc ggaaagccgc 240 tttgaccgca agcggggaaa gggcctagacgggggctact ttcacctcac cctcctcgcc 300 aaggacttca cggggtacca gaacctggtgcgcctggcga gccgggctta cctggagggg 360 ttttacgaaa agccccggat tgaccgggagatcctgcgcg agcacgccga gggcctcatc 420 gccctctcgg ggtgcctcgg ggcggagatcccccagttca tcctccagga ccgtctggac 480 ctggccgagg cccggctcaa cgagtacctctccatcttca aggaccgctt cttcatcgag 540 atccagaacc acggcctccc cgagcagaaaaaggtcaacg aggtcctcaa ggagttcgcc 600 cgaaagtacg gcctggggat ggtggccaccaacgacggcc attacgtgag gaaggaggac 660 gcccgcgccc acgaggtcct cctcgccatccagtccaaga gcaccctgga cgaccccggg 720 cgctggcgct tcccctgcga cgagttctacgtgaagaccc ccgaggagat gcgggccatg 780 ttccccgagg aggagtgggg ggacgagccctttgacaaca ccgtggagat cgcccgcatg 840 tgcaacgtgg agctgcccat cggggacaagatggtctacc gaatcccccg cttccccctc 900 cccgaggggc ggaccgaggc ccagtacctcatggagctca ccttcaaggg gctcctccgc 960 cgctacccgg accggatcac cgagggcttctaccgggagg tcttccgcct tttggggaag 1020 cttccccccc acggggacgg ggaggccttggccgaggcct tggcccaggt ggagcgggag 1080 gcttgggaga ggctcatgaa gagcctcccccctttggccg gggtcaagga gtggacggcg 1140 gaggccattt tccaccgggc cctttacgagctttccgtga tagagcgcat ggggtttccc 1200 ggctacttcc tcatcgtcca ggactacatcaactgggccc ggagaaacgg cgtctccgtg 1260 gggcccggca gggggagcgc cgccgggagcctggtggcct 
acgccgtggg gatcaccaac 1320 attgaccccc tccgcttcgg cctcctctttgagcgcttcc tgaacccgga gagggtctcc 1380 atgcccgaca ttgacacgga cttctccgaccgggagcggg accgggtgat ccagtacgtg 1440 cgggagcgct acggcgagga caaggtggcccagatcggca ccctgggaag cctcgcctcc 1500 aaggccgccc tcaaggacgt ggcccgggtctacggcatcc cccacaagaa ggcggaggaa 1560 ttggccaagc tcatcccggt gcagttcgggaagcccaagc ccctgcagga ggccatccag 1620 gtggtgccgg agcttagggc ggagatggagaaggacccca aggtgcggga ggtcctcgag 1680 gtggccatgc gcctggaggg cctgaaccgccacgcctccg tccacgccgc cggggtggtg 1740 atcgccgccg agcccctcac ggacctcgtccccctcatgc gcgaccagga agggcggccc 1800 gtcacccagt acgacatggg ggcggtggaggccttggggc ttttgaagat ggactttttg 1860 ggcctccgca ccctcacctt cctggacgaggtcaagcgca tcgtcaaggc gtcccagggg 1920 gtggagctgg actacgatgc cctccccctggacgacccca agaccttcgc cctcctctcc 1980 cggggggaga ccaagggggt cttccagctggagtcggggg ggatgaccgc cacgctccgc 2040 ggcctcaagc cgcggcgctt tgaggacctgatcgccatcc tctccctcta ccgccccggg 2100 cccatggagc acatccccac ctacatccgccgccaccacg ggctggagcc cgtgagctac 2160 agcgagtttc cccacgccga gaagtacctaaagcccatcc tggacgagac ctacggcatc 2220 cccgtctacc aggagcagat catgcagatcgcctcggccg tggcggggta ctccctgggc 2280 gaggcggacc tcctgcggcg gtccatgggcaagaagaagg tggaggagat gaagtcccac 2340 cgggagcgct tcgtccaggg ggccaaggaaaggggcgtgc ccgaggagga ggccaaccgc 2400 ctctttgaca tgctggaggc cttcgccaactacggcttca acaaatccca cgctgccgcc 2460 tacagcctcc tctcctacca gaccgcctacgtgaaggccc actaccccgt ggagttcatg 2520 gccgccctcc tctccgtgga gcggcacgactccgacaagg tggccgagta catccgcgac 2580 gcccgggcca tgggcataga ggtccttcccccggacgtca accgctccgg gtttgacttc 2640 ctggtccagg gccggcagat ccttttcggcctctccgcgg tgaagaacgt gggcgaggcg 2700 gcggcggagg ccattctccg ggagcgggagcggggcggcc cctaccggag cctcggcgac 2760 ttcctcaagc ggctggacga gaaggtgctcaacaagcgga ccctggagtc cctcatcaag 2820 gcgggcgccc tggacggctt cggggaaagggcgcggctcc tcgcctccct ggaagggctc 2880 ctcaagtggg cggccgagaa ccgggagaaggcccgctcgg gcatgatggg cctcttcagc 2940 gaagtggagg agccgccttt ggccgaggccgcccccctgg acgagatcac ccggctccgc 3000 tacgagaagg 
aggccctggg gatctacgtctccggccacc ccatcttgcg gtaccccggg 3060 ctccgggaga cggccacctg caccctggaggagcttcccc acctggcccg ggacctgccg 3120 ccccggtcta gggtcctcct tgccgggatggtggaggagg tggtgcgcaa gcccacaaag 3180 agcggcggga tgatggcccg cttcgtcctctccgacgaga cgggggcgct tgaggcggtg 3240 gcattcggcc gggcctacga ccaggtctccccgaggctca aggaggacac ccccgtgctc 3300 gtcctcgccg aggtggagcg ggaggaggggggcgtgcggg tgctggccca ggccgtttgg 3360 acctacgagg agctggagca ggtcccccgggccctcgagg tggaggtgga ggcctccctc 3420 ctggacgacc ggggggtggc ccacctgaaaagcctcctgg acgagcacgc ggggaccctc 3480 cccctgtacg tccgggtcca gggcgccttcggcgaggccc tcctcgccct gagggaggtg 3540 cgggtggggg aggaggctgt aggcggccgcgtggttccgg gcctacctcc tgcccgaccg 3600 ggaggtcctt ctccagggcg gccaggcgggggaggcccag gaggcggtgc ccttctaggg 3660 ggtgggccgt gagacctagc gccatcgttctcgccggggg caaggaggcc tgggcccgac 3720 cccttttgg 3729 87 1245 PRT Thermusthermophilus 87 Met Gly Arg Glu Leu Arg Phe Ala His Leu His Gln His ThrGln Phe 1 5 10 15 Ser Leu Leu Asp Gly Ala Pro Lys Leu Ser Asp Leu LeuLys Trp Val 20 25 30 Glu Glu Thr Thr Pro Glu Asp Pro Ala Leu Ala Met ThrAsp His Gly 35 40 45 Asn Leu Phe Gly Ala Val Glu Phe Tyr Lys Lys Ala ThrGlu Met Gly 50 55 60 Ile Lys Pro Ile Leu Gly Tyr Glu Ala Tyr Val Ala AlaGlu Ser Arg 65 70 75 80 Phe Asp Arg Lys Arg Gly Lys Gly Leu Asp Gly GlyTyr Phe His Leu 85 90 95 Thr Leu Leu Ala Lys Asp Phe Thr Gly Tyr Gln AsnLeu Val Arg Leu 100 105 110 Ala Ser Arg Ala Tyr Leu Glu Gly Phe Tyr GluLys Pro Arg Ile Asp 115 120 125 Arg Glu Ile Leu Arg Glu His Ala Glu GlyLeu Ile Ala Leu Ser Gly 130 135 140 Cys Leu Gly Ala Glu Ile Pro Gln PheIle Leu Gln Asp Arg Leu Asp 145 150 155 160 Leu Ala Glu Ala Arg Leu AsnGlu Tyr Leu Ser Ile Phe Lys Asp Arg 165 170 175 Phe Phe Ile Glu Ile GlnAsn His Gly Leu Pro Glu Gln Lys Lys Val 180 185 190 Asn Glu Val Leu LysGlu Phe Ala Arg Lys Tyr Gly Leu Gly Met Val 195 200 205 Ala Thr Asn AspGly His Tyr Val Arg Lys Glu Asp Ala Arg Ala His 210 215 220 Glu Val LeuLeu Ala Ile Gln Ser Lys Ser Thr Leu Asp Asp Pro Gly 225 230 235 240 
AlaLeu Ala Leu Pro Cys Glu Glu Phe Tyr Val Lys Thr Pro Glu Glu 245 250 255Met Arg Ala Met Phe Pro Glu Glu Glu Val Gly Gly Arg Ser Pro Leu 260 265270 Thr Thr Pro Trp Arg Ser Pro His Val Gln Arg Gly Ala Ala Ile Gly 275280 285 Thr Arg Trp Ser Thr Arg Ile Pro Arg Phe Pro Leu Pro Glu Gly Arg290 295 300 Thr Glu Ala Gln Tyr Leu Met Glu Leu Thr Phe Lys Gly Leu LeuArg 305 310 315 320 Arg Tyr Pro Asp Arg Ile Thr Glu Gly Phe Tyr Arg GluVal Phe Arg 325 330 335 Leu Ser Gly Lys Leu Pro Pro His Gly Asp Gly GluAla Leu Ala Glu 340 345 350 Ala Leu Ala Gln Val Glu Arg Glu Ala Trp GluArg Leu Met Lys Ser 355 360 365 Leu Pro Pro Leu Ala Gly Val Lys Glu TrpThr Ala Glu Ala Ile Phe 370 375 380 His Arg Ala Leu Tyr Glu Leu Ser AlaIle Glu Arg Met Gly Phe Pro 385 390 395 400 Gly Leu Leu Pro His Arg ProGly Leu His Gln Leu Gly Pro Glu Lys 405 410 415 Gly Val Ser Val Gly ProGly Arg Gly Gly Ala Ala Gly Ser Leu Val 420 425 430 Ala Tyr Ala Val GlyIle Thr Asn Ile Asp Pro Leu Arg Phe Gly Leu 435 440 445 Leu Phe Glu ArgPhe Leu Asn Pro Glu Arg Val Ser Met Pro Asp Ile 450 455 460 Asp Thr AspPhe Ser Asp Arg Glu Arg Asp Arg Val Ile Gln Tyr Val 465 470 475 480 ArgGlu Arg Tyr Gly Glu Asp Lys Val Ala Gln Ile Gly Thr Leu Gly 485 490 495Ser Leu Ala Ser Lys Ala Ala Leu Lys Glu Val Ala Arg Val Tyr Gly 500 505510 Ile Pro Arg Lys Lys Ala Glu Glu Leu Ala Lys Leu Ile Pro Val Gln 515520 525 Phe Gly Lys Pro Lys Pro Leu Gln Glu Ala Ile Gln Val Val Pro Glu530 535 540 Leu Arg Ala Glu Met Glu Lys Asp Pro Lys Val Arg Glu Val LeuGlu 545 550 555 560 Val Ala Met Arg Leu Glu Gly Leu Asn Arg His Ala SerVal His Ala 565 570 575 Gly Arg Gly Gly Val Phe Ser Glu Pro Leu Thr AspLeu Val Pro Leu 580 585 590 Cys Ala Thr Arg Lys Gly Gly Pro Tyr Thr GlnTyr Asp Met Gly Ala 595 600 605 Val Glu Ala Leu Gly Leu Leu Lys Met AspPhe Leu Gly Leu Arg Thr 610 615 620 Leu Thr Phe Leu Asp Glu Val Lys ArgIle Val Lys Ala Ser Gln Gly 625 630 635 640 Val Glu Leu Asp Tyr Asp AlaLeu Pro Leu Asp Asp Pro Lys Thr Phe 645 650 655 Ala Leu Leu Ser Arg GlyGlu Thr 
Lys Gly Val Phe Gln Leu Glu Ser 660 665 670 Gly Gly Met Thr AlaThr Leu Arg Gly Leu Lys Pro Arg Arg Phe Glu 675 680 685 Asp Leu Ile AlaIle Leu Ser Leu Tyr Arg Pro Gly Pro Met Glu His 690 695 700 Ile Pro ThrTyr Ile Arg Arg His His Gly Leu Glu Pro Val Ser Tyr 705 710 715 720 SerGlu Phe Pro His Ala Glu Lys Tyr Leu Lys Pro Ile Leu Asp Glu 725 730 735Thr Tyr Gly Ile Pro Val Tyr Gln Glu Gln Ile Met Gln Ile Ala Ser 740 745750 Ala Val Ala Gly Tyr Ser Leu Gly Glu Ala Asp Leu Leu Arg Arg Ser 755760 765 Met Gly Lys Lys Lys Val Glu Glu Met Lys Ser His Arg Glu Arg Phe770 775 780 Val Gln Gly Ala Lys Glu Arg Gly Val Pro Glu Glu Glu Ala AsnArg 785 790 795 800 Leu Phe Asp Met Leu Glu Ala Phe Ala Asn Tyr Gly PheAsn Lys Ser 805 810 815 His Ala Ala Ala Tyr Ser Leu Leu Ser Tyr Gln ThrAla Tyr Val Lys 820 825 830 Ala His Tyr Pro Val Glu Phe Met Ala Ala LeuLeu Ser Val Glu Arg 835 840 845 His Asp Ser Asp Lys Val Ala Glu Tyr IleArg Asp Ala Arg Ala Met 850 855 860 Gly Ile Glu Val Leu Pro Pro Asp ValAsn Arg Ser Gly Phe Asp Phe 865 870 875 880 Leu Val Gln Gly Arg Gln IleLeu Phe Gly Leu Ser Ala Val Lys Asn 885 890 895 Val Gly Glu Ala Ala AlaGlu Ala Ile Leu Arg Glu Arg Glu Arg Gly 900 905 910 Gly Pro Tyr Arg SerLeu Gly Asp Phe Leu Lys Arg Leu Asp Glu Lys 915 920 925 Val Leu Asn LysArg Thr Leu Glu Ser Leu Ile Lys Ala Gly Ala Leu 930 935 940 Asp Gly PheGly Glu Arg Ala Arg Leu Leu Ala Ser Leu Glu Gly Leu 945 950 955 960 LeuLys Trp Ala Ala Glu Asn Arg Glu Lys Ala Arg Ser Gly Met Met 965 970 975Gly Leu Phe Ser Glu Val Glu Glu Pro Pro Leu Ala Glu Ala Ala Pro 980 985990 Leu Asp Glu Ile Thr Arg Leu Arg Tyr Glu Lys Glu Ala Leu Gly Ile 9951000 1005 Tyr Val Ser Gly His Pro Ile Leu Arg Tyr Pro Gly Leu Arg GluThr 1010 1015 1020 Ala Thr Cys Thr Leu Glu Glu Leu Pro His Leu Ala ArgAsp Leu Pro 1025 1030 1035 1040 Pro Arg Ser Arg Val Leu Leu Ala Gly MetVal Glu Glu Val Val Arg 1045 1050 1055 Lys Pro Thr Lys Ser Gly Gly MetMet Ala Arg Phe Val Leu Ser Asp 1060 1065 1070 Glu Thr Gly Ala Leu GluAla Val Ala Phe Gly Arg 
Ala Tyr Asp Gln 1075 1080 1085 Val Ser Pro ArgLeu Lys Glu Asp Thr Pro Val Leu Val Leu Ala Glu 1090 1095 1100 Val GluArg Glu Glu Gly Gly Val Arg Val Leu Ala Gln Ala Val Trp 1105 1110 11151120 Thr Tyr Gln Glu Leu Glu Gln Val Pro Arg Ala Leu Glu Val Glu Val1125 1130 1135 Glu Ala Ser Leu Pro Asp Asp Arg Gly Val Ala His Leu LysSer Leu 1140 1145 1150 Leu Asp Glu His Ala Gly Thr Leu Pro Leu Tyr ValArg Val Gln Gly 1155 1160 1165 Ala Phe Gly Glu Ala Leu Leu Ala Leu ArgGlu Val Arg Val Gly Glu 1170 1175 1180 Glu Ala Leu Gly Ala Leu Glu AlaAla Gly Phe Pro Ala Tyr Leu Leu 1185 1190 1195 1200 Pro Asn Arg Glu ValSer Pro Arg Leu Thr Gly Ser Gly Gly Pro Arg 1205 1210 1215 Gly Arg AlaLeu Ser Thr Gly Leu Ala Leu Lys Thr Tyr Pro Ile Ala 1220 1225 1230 LeuPro Gly Gly Asn Glu Ala Leu Ala Arg Pro Leu Leu 1235 1240 1245 88 198PRT Thermus thermophilus 88 Val Glu Arg Val Val Arg Thr Leu Leu Asp GlyArg Phe Leu Leu Glu 1 5 10 15 Glu Gly Val Gly Leu Trp Glu Trp Arg TyrPro Phe Pro Leu Glu Gly 20 25 30 Glu Ala Val Val Val Leu Asp Leu Glu ThrThr Gly Leu Ala Gly Leu 35 40 45 Asp Glu Val Ile Glu Val Gly Leu Leu ArgLeu Glu Gly Gly Arg Arg 50 55 60 Leu Pro Phe Gln Ser Leu Val Arg Pro LeuPro Pro Ala Glu Ala Arg 65 70 75 80 Ser Trp Asn Leu Thr Gly Ile Pro ArgGlu Ala Leu Glu Glu Ala Pro 85 90 95 Ser Leu Glu Glu Val Leu Glu Lys AlaTyr Pro Leu Arg Gly Asp Ala 100 105 110 Thr Leu Val Ile His Asn Ala AlaPhe Asp Leu Gly Phe Leu Arg Pro 115 120 125 Ala Leu Glu Gly Leu Gly TyrArg Leu Glu Asn Pro Val Val Asp Ser 130 135 140 Leu Arg Leu Ala Arg ArgGly Leu Pro Gly Leu Arg Arg Tyr Gly Leu 145 150 155 160 Asp Ala Leu SerGlu Val Leu Glu Leu Pro Arg Arg Thr Cys His Arg 165 170 175 Ala Leu GluAsp Val Glu Arg Thr Leu Ala Val Val His Glu Val Tyr 180 185 190 Tyr MetLeu Thr Ser Gly 195 89 182 PRT Deinococcus radiodurans PEPTIDE (79) X atposition 79 is undefined 89 Pro Trp Pro Gln Asp Val Val Val Phe Asp LeuGlu Thr Thr Gly Phe 1 5 10 15 Ser Pro Ala Ser Ala Ala Ile Val Glu IleGly Ala Val Arg Ile Val 20 25 30 Gly Gly Gln Ile 
Asp Glu Thr Leu Lys PheGlu Thr Leu Val Arg Pro 35 40 45 Thr Arg Pro Asp Gly Ser Met Leu Ser IlePro Trp Gln Ala Gln Arg 50 55 60 Val His Gly Ile Ser Asp Glu Met Val ArgArg Ala Pro Ala Xaa Lys 65 70 75 80 Asp Val Leu Pro Asp Phe Phe Asp PheVal Asp Gly Ser Ala Val Val 85 90 95 Ala His Asn Val Ser Phe Asp Gly GlyPhe Met Arg Ala Gly Ala Glu 100 105 110 Arg Leu Gly Leu Ser Trp Ala ProGlu Arg Glu Leu Cys Thr Met Gln 115 120 125 Leu Ser Arg Arg Ala Phe ProArg Glu Arg Thr His Asn Leu Thr Val 130 135 140 Leu Ala Glu Arg Leu GlyLeu Glu Phe Ala Pro Gly Gly Arg His Arg 145 150 155 160 Ser Tyr Gly AspVal Gln Val Thr Ala Gln Ala Tyr Leu Arg Leu Leu 165 170 175 Glu Leu LeuGly Glu Arg 180 90 201 PRT Bacillus subtilis 90 His Gly Ile Lys Met IleTyr Gly Met Glu Ala Asn Leu Val Asp Asp 1 5 10 15 Gly Val Pro Ile AlaTyr Asn Ala Ala His Arg Leu Leu Glu Glu Glu 20 25 30 Thr Tyr Val Val PheAsp Val Glu Thr Thr Gly Leu Ser Ala Val Tyr 35 40 45 Asp Thr Ile Ile GluLeu Ala Ala Val Lys Val Lys Gly Gly Glu Ile 50 55 60 Ile Asp Lys Phe GluAla Phe Ala Asn Pro His Arg Pro Leu Ser Ala 65 70 75 80 Thr Ile Ile GluLeu Thr Gly Ile Thr Asp Asp Met Leu Gln Asp Ala 85 90 95 Pro Asp Val ValAsp Val Ile Arg Asp Phe Arg Glu Trp Ile Gly Asp 100 105 110 Asp Ile LeuVal Ala His Asn Ala Ser Phe Asp Met Gly Phe Leu Asn 115 120 125 Val AlaTyr Lys Lys Leu Leu Glu Val Glu Lys Ala Lys Asn Pro Val 130 135 140 IleAsp Thr Leu Glu Leu Gly Arg Phe Leu Tyr Pro Glu Phe Lys Asn 145 150 155160 His Arg Leu Asn Thr Leu Cys Lys Lys Phe Asp Ile Glu Leu Thr Gln 165170 175 His His Arg Ala Ile Tyr Asp Thr Glu Ala Thr Ala Tyr Leu Leu Leu180 185 190 Lys Met Leu Lys Asp Ala Ala Glu Lys 195 200 91 188 PRTHaemophilus influenzae PEPTIDE (47) X at position 47 is undefined 91 MetIle Asn Pro Asn Arg Gln Ile Val Leu Asp Thr Glu Thr Thr Gly 1 5 10 15Met Asn Gln Leu Gly Ala His Tyr Glu Gly His Cys Ile Ile Glu Ile 20 25 30Gly Ala Val Glu Leu Ile Asn Arg Arg Tyr Thr Gly Asn Asn Xaa His 35 40 45Ile Tyr Ile Lys Pro Asp Arg Pro Xaa Asp Pro Asp Ala Ile 
Lys Val 50 55 60His Gly Ile Thr Asp Glu Met Leu Ala Asp Lys Pro Glu Phe Lys Glu 65 70 7580 Val Ala Gln Asp Phe Leu Asp Tyr Ile Asn Gly Ala Glu Leu Leu Ile 85 9095 His Asn Ala Pro Phe Asp Val Gly Phe Met Asp Tyr Glu Phe Arg Lys 100105 110 Leu Asn Leu Asn Val Lys Thr Asp Asp Ile Cys Leu Val Thr Asp Thr115 120 125 Leu Gln Met Ala Arg Gln Met Tyr Pro Gly Lys Arg Asn Asn LeuAsp 130 135 140 Ala Leu Cys Asp Arg Leu Gly Ile Asp Asn Ser Lys Arg ThrLeu His 145 150 155 160 Gly Ala Leu Leu Asp Ala Glu Ile Leu Ala Asp ValTyr Leu Met Met 165 170 175 Thr Gly Gly Gln Thr Asn Leu Phe Asp Glu GluGlu 180 185 92 189 PRT Escherichia coli 92 Met Ser Thr Ala Ile Thr ArgGln Ile Val Leu Asp Thr Glu Thr Thr 1 5 10 15 Gly Met Asn Gln Ile GlyAla His Ser Glu Gly His Lys Ile Ile Glu 20 25 30 Ile Gly Ala Val Glu ValVal Asn Arg Arg Leu Thr Gly Asn Asn Phe 35 40 45 His Val Tyr Leu Lys AspArg Leu Val Asp Pro Glu Ala Phe Gly Val 50 55 60 His Gly Ile Ala Val AspPhe Leu Leu Asp Lys Pro Thr Phe Ala Glu 65 70 75 80 Val Ala Val Glu PheMet Asp Tyr Ile Arg Gly Ala Glu Leu Val Ile 85 90 95 His Asn Ala Ala PheAsp Ile Gly Phe Met Asp Tyr Glu Phe Ser Leu 100 105 110 Leu Lys Arg AspIle Ala Lys Thr Asn Thr Phe Cys Lys Val Thr Asp 115 120 125 Ser Leu AlaVal Ala Arg Lys Met Phe Pro Gly Lys Arg Asn Ser Leu 130 135 140 Asp AlaLeu Cys Ala Arg Tyr Glu Ile Asp Asn Ser Lys Arg Thr Leu 145 150 155 160His Gly Ala Leu Leu Asp Ala Gln Ile Leu Ala Glu Val Tyr Leu Ala 165 170175 Met Thr Gly Gly Gln Thr Ser Met Ala Phe Ala Met Glu 180 185 93 201PRT Helicobacter pylori 93 Asn Leu Glu Tyr Leu Lys Ala Cys Gly Leu AsnPhe Ile Glu Thr Ser 1 5 10 15 Glu Asn Leu Ile Thr Leu Lys Asn Leu LysThr Pro Leu Lys Asp Glu 20 25 30 Val Phe Ser Phe Ile Asp Leu Glu Thr ThrGly Ser Cys Pro Ile Lys 35 40 45 His Glu Ile Leu Glu Ile Gly Ala Val GlnVal Lys Gly Gly Glu Ile 50 55 60 Ile Asn Arg Phe Glu Thr Leu Val Lys ValLys Ser Val Pro Asp Tyr 65 70 75 80 Ile Ala Glu Leu Thr Gly Ile Thr TyrGlu Asp Thr Leu Asn Ala Pro 85 90 95 Ser Ala His Glu Ala Leu Gln 
Glu LeuArg Leu Phe Leu Gly Asn Ser 100 105 110 Val Phe Val Ala His Asn Ala AsnPhe Asp Tyr Asn Phe Leu Gly Arg 115 120 125 Tyr Phe Val Glu Lys Leu HisCys Pro Leu Leu Asn Leu Lys Leu Cys 130 135 140 Thr Leu Asp Leu Ser LysArg Ala Ile Leu Ser Met Arg Tyr Ser Leu 145 150 155 160 Ser Phe Leu LysGlu Leu Leu Gly Phe Gly Ile Glu Val Ser His Arg 165 170 175 Ala Tyr AlaAsp Ala Leu Ala Ser Tyr Lys Leu Phe Glu Ile Cys Leu 180 185 190 Leu AsnLeu Pro Ser Tyr Ile Lys Thr 195 200 94 630 DNA Thermus thermophilus 94atggtggagc gggtggtgcg gacccttctg gacgggaggt tcctcctgga ggagggggtg 60gggctttggg agtggcgcta cccctttccc ctggaggggg aggcggtggt ggtcctggac 120ctggagacca cggggcttgc cggcctggac gaggtgattg aggtgggcct cctccgcctg 180gaggggggga ggcgcctccc cttccagagc ctcgtccggc ccctcccgcc cgccgaagcc 240cgttcgtgga acctcaccgg catcccccgg gaggccctgg aggaggcccc ctccctggag 300gaggttctgg agaaggccta ccccctccgc ggcgacgcca ccttggtgat ccacaacgcc 360gcctttgacc tgggcttcct ccgcccggcc ttggagggcc tgggctaccg cctggaaaac 420cccgtggtgg actccctgcg cttggccaga cggggcttac caggccttag gcgctacggc 480ctggacgccc tctccgaggt cctggagctt ccccgaagga cctgccaccg ggccctcgag 540gacgtggagc gcaccctcgc cgtggtgcac gaggtatact atatgcttac gtccggccgt 600ccccgcacgc tttgggaact cgggaggtag 630 95 210 PRT Thermus thermophilus 95Met Val Glu Arg Val Val Arg Thr Leu Leu Asp Gly Arg Phe Leu Leu 1 5 1015 Glu Glu Gly Val Gly Leu Trp Glu Trp Arg Tyr Pro Phe Pro Leu Glu 20 2530 Gly Glu Ala Val Val Val Leu Asp Leu Glu Thr Thr Gly Leu Ala Gly 35 4045 Leu Asp Glu Val Ile Glu Val Gly Leu Leu Arg Leu Glu Gly Gly Arg 50 5560 Arg Leu Pro Phe Gln Ser Leu Val Arg Pro Leu Pro Pro Ala Glu Ala 65 7075 80 Arg Ser Trp Asn Leu Thr Gly Ile Pro Arg Glu Ala Leu Glu Glu Ala 8590 95 Pro Ser Leu Glu Glu Val Leu Glu Lys Ala Tyr Pro Leu Arg Gly Asp100 105 110 Ala Thr Leu Val Ile His Asn Ala Ala Phe Asp Leu Gly Phe LeuArg 115 120 125 Pro Ala Leu Glu Gly Leu Gly Tyr Arg Leu Glu Asn Pro ValVal Asp 130 135 140 Ser Leu Arg Leu Ala Arg Arg Gly Leu Pro Gly Leu ArgArg Tyr Gly 145 150 155 
160 Leu Asp Ala Leu Ser Glu Val Leu Glu Leu ProArg Arg Thr Cys His 165 170 175 Arg Ala Leu Glu Asp Val Glu Arg Thr LeuAla Val Val His Glu Val 180 185 190 Tyr Tyr Met Leu Thr Ser Gly Arg ProArg Thr Leu Trp Glu Leu Gly 195 200 205 Arg Glx 210 96 461 PRTPseudomonas marcesans 96 Met Leu Glu Ala Ser Trp Glu Lys Val Gln Ser SerLeu Lys Gln Asn 1 5 10 15 Leu Ser Lys Pro Ser Tyr Glu Thr Trp Ile ArgPro Thr Glu Phe Ser 20 25 30 Gly Phe Lys Asn Gly Glu Leu Thr Leu Ile AlaPro Asn Ser Phe Ser 35 40 45 Ser Ala Trp Leu Lys Asn Asn Tyr Ser Gln ThrIle Gln Glu Thr Ala 50 55 60 Glu Glu Ile Phe Gly Glu Pro Val Thr Val HisVal Lys Val Lys Ala 65 70 75 80 Asn Ala Glu Ser Ser Asp Glu His Tyr SerSer Ala Pro Ile Thr Pro 85 90 95 Pro Leu Glu Ala Ser Pro Gly Ser Val AspSer Ser Gly Ser Ser Leu 100 105 110 Arg Leu Ser Lys Lys Thr Leu Pro LeuLeu Asn Leu Arg Tyr Val Phe 115 120 125 Asn Arg Phe Val Val Gly Pro AsnSer Arg Met Ala His Ala Ala Ala 130 135 140 Met Ala Val Ala Glu Ser ProGly Arg Glu Phe Asn Pro Leu Phe Ile 145 150 155 160 Cys Gly Gly Val GlyLeu Gly Lys Thr His Leu Met Gln Ala Ile Gly 165 170 175 His Tyr Arg LeuGlu Ile Asp Pro Gly Ala Lys Val Ser Tyr Val Ser 180 185 190 Thr Glu ThrPhe Thr Asn Asp Leu Ile Leu Ala Ile Arg Gln Asp Arg 195 200 205 Met GlnAla Phe Arg Asp Arg Tyr Arg Ala Ala Asp Leu Ile Leu Val 210 215 220 AspAsp Ile Gln Phe Ile Glu Gly Lys Glu Tyr Thr Gln Glu Glu Phe 225 230 235240 Phe His Thr Phe Asn Ala Leu His Asp Ala Gly Ser Gln Ile Val Leu 245250 255 Ala Ser Asp Arg Pro Pro Ser Gln Ile Pro Arg Leu Gln Glu Arg Leu260 265 270 Met Ser Arg Phe Ser Met Gly Leu Ile Ala Asp Val Gln Ala ProAsp 275 280 285 Leu Glu Thr Arg Met Ala Ile Leu Gln Lys Lys Ala Glu HisGlu Arg 290 295 300 Val Gly Leu Pro Arg Asp Leu Ile Gln Phe Ile Ala GlyArg Phe Thr 305 310 315 320 Ser Asn Ile Arg Glu Leu Glu Gly Ala Leu ThrArg Ala Ile Ala Phe 325 330 335 Ala Ser Ile Thr Gly Leu Pro Met Thr ValAsp Ser Ile Ala Pro Met 340 345 350 Leu Asp Pro Asn Gly Gln Gly Val GluVal Thr Pro Lys Gln Val Leu 355 360 365 
Asp Lys Val Ala Glu Val Phe LysVal Thr Pro Asp Glu Met Arg Ser 370 375 380 Ala Ser Arg Arg Arg Pro ValSer Gln Ala Arg Gln Val Gly Met Tyr 385 390 395 400 Leu Met Arg Gln GlyThr Asn Leu Ser Leu Pro Arg Ile Gly Asp Thr 405 410 415 Phe Gly Gly LysAsp His Thr Thr Val Met Tyr Ala Ile Glu Gln Val 420 425 430 Glu Lys LysLeu Ser Ser Asp Pro Gln Ile Ala Ser Gln Val Gln Lys 435 440 445 Ile ArgAsp Leu Leu Gln Ile Asp Ser Arg Arg Lys Arg 450 455 460 97 447 PRTSynechocystis sp. 97 Met Val Ser Cys Glu Asn Leu Trp Gln Gln Ala Leu AlaIle Leu Ala 1 5 10 15 Thr Gln Leu Thr Lys Pro Ala Phe Asp Thr Trp IleLys Ala Ser Val 20 25 30 Leu Ile Ser Leu Gly Asp Gly Val Ala Thr Ile GlnVal Glu Asn Gly 35 40 45 Phe Val Leu Asn His Leu Gln Lys Ser Tyr Gly ProLeu Leu Met Glu 50 55 60 Val Leu Thr Asp Leu Thr Gly Gln Glu Ile Thr ValLys Leu Ile Thr 65 70 75 80 Asp Gly Leu Glu Pro His Ser Leu Ile Gly GlnGlu Ser Ser Leu Pro 85 90 95 Met Glu Thr Thr Pro Lys Asn Ala Thr Ala LeuAsn Gly Lys Tyr Thr 100 105 110 Phe Ser Arg Phe Val Val Gly Pro Thr AsnArg Met Ala His Ala Ala 115 120 125 Ser Leu Ala Val Ala Glu Ser Pro GlyArg Glu Phe Asn Pro Leu Phe 130 135 140 Leu Cys Gly Gly Val Gly Leu GlyLys Thr His Leu Met Gln Ala Ile 145 150 155 160 Ala His Tyr Arg Leu GluMet Tyr Pro Asn Ala Lys Val Tyr Tyr Val 165 170 175 Ser Thr Glu Arg PheThr Asn Asp Leu Ile Thr Ala Ile Arg Gln Asp 180 185 190 Asn Met Glu AspPhe Arg Ser Tyr Tyr Arg Ser Ala Asp Phe Leu Leu 195 200 205 Ile Asp AspIle Gln Phe Ile Lys Gly Lys Glu Tyr Thr Gln Glu Glu 210 215 220 Phe PheHis Thr Phe Asn Ser Leu His Glu Ala Gly Lys Gln Val Val 225 230 235 240Val Ala Ser Asp Arg Ala Pro Gln Arg Ile Pro Gly Leu Gln Asp Arg 245 250255 Leu Ile Ser Arg Phe Ser Met Gly Leu Ile Ala Asp Ile Gln Val Pro 260265 270 Asp Leu Glu Thr Arg Met Ala Ile Leu Gln Lys Lys Ala Glu Tyr Asp275 280 285 Arg Ile Arg Leu Pro Lys Glu Val Ile Glu Tyr Ile Ala Ser HisTyr 290 295 300 Thr Ser Asn Ile Arg Glu Leu Glu Gly Ala Leu Ile Arg AlaIle Ala 305 310 315 320 Tyr Thr Ser Leu Ser Asn Val 
Ala Met Thr Val GluAsn Ile Ala Pro 325 330 335 Val Leu Asn Pro Pro Val Glu Lys Val Ala AlaAla Pro Glu Thr Ile 340 345 350 Ile Thr Ile Val Ala Gln His Tyr Gln LeuLys Val Glu Glu Leu Leu 355 360 365 Ser Asn Ser Arg Arg Arg Glu Val SerLeu Ala Arg Gln Val Gly Met 370 375 380 Tyr Leu Met Arg Gln His Thr AspLeu Ser Leu Pro Arg Ile Gly Glu 385 390 395 400 Ala Phe Gly Gly Lys AspHis Thr Thr Val Met Tyr Ser Cys Asp Lys 405 410 415 Ile Thr Gln Leu GlnGln Lys Asp Trp Glu Thr Ser Gln Thr Leu Thr 420 425 430 Ser Leu Ser HisArg Ile Asn Ile Ala Gly Gln Ala Pro Glu Ser 435 440 445 98 446 PRTBacillus subtilis 98 Met Glu Asn Ile Leu Asp Leu Trp Asn Gln Ala Leu AlaGln Ile Glu 1 5 10 15 Lys Lys Leu Ser Lys Pro Ser Phe Glu Thr Trp MetLys Ser Thr Lys 20 25 30 Ala His Ser Leu Gln Gly Asp Thr Leu Thr Ile ThrAla Pro Asn Glu 35 40 45 Phe Ala Arg Asp Trp Leu Glu Ser Arg Tyr Leu HisLeu Ile Ala Asp 50 55 60 Thr Ile Tyr Glu Leu Thr Gly Glu Glu Leu Ser IleLys Phe Val Ile 65 70 75 80 Pro Gln Asn Gln Asp Val Glu Asp Phe Met ProLys Pro Gln Val Lys 85 90 95 Lys Ala Val Lys Glu Asp Thr Ser Asp Phe ProGln Asn Met Leu Asn 100 105 110 Pro Lys Tyr Thr Phe Asp Thr Phe Val IleGly Ser Gly Asn Arg Phe 115 120 125 Ala His Ala Ala Ser Leu Ala Val AlaGlu Ala Pro Ala Lys Ala Tyr 130 135 140 Asn Pro Leu Phe Ile Tyr Gly GlyVal Gly Leu Gly Lys Thr His Leu 145 150 155 160 Met His Ala Ile Gly HisTyr Val Ile Asp His Asn Pro Ser Ala Lys 165 170 175 Val Val Tyr Leu SerSer Glu Lys Phe Thr Asn Glu Phe Ile Asn Ser 180 185 190 Ile Arg Asp AsnLys Ala Val Asp Phe Arg Asn Arg Tyr Arg Asn Val 195 200 205 Asp Val LeuLeu Ile Asp Asp Ile Gln Phe Leu Ala Gly Lys Glu Gln 210 215 220 Thr GlnGlu Glu Phe Phe His Thr Phe Asn Thr Leu His Glu Glu Ser 225 230 235 240Lys Gln Ile Val Ile Ser Ser Asp Arg Pro Pro Lys Glu Ile Pro Thr 245 250255 Leu Glu Asp Arg Leu Arg Ser Arg Phe Glu Trp Gly Leu Ile Thr Asp 260265 270 Ile Thr Pro Pro Asp Leu Glu Thr Arg Ile Ala Ile Leu Arg Lys Lys275 280 285 Ala Lys Ala Glu Gly Leu Asp Ile Pro Asn Glu Val Met 
Leu TyrIle 290 295 300 Ala Asn Gln Ile Asp Ser Asn Ile Arg Glu Leu Glu Gly AlaLeu Ile 305 310 315 320 Arg Val Val Ala Tyr Ser Ser Leu Ile Asn Lys AspIle Asn Ala Asp 325 330 335 Leu Ala Ala Glu Ala Leu Lys Asp Ile Ile ProSer Ser Lys Pro Lys 340 345 350 Val Ile Thr Ile Lys Glu Ile Gln Arg ValVal Gly Gln Gln Phe Asn 355 360 365 Ile Lys Leu Glu Asp Phe Lys Ala LysLys Arg Thr Lys Ser Val Ala 370 375 380 Phe Pro Arg Gln Ile Ala Met TyrLeu Ser Arg Glu Met Thr Asp Ser 385 390 395 400 Ser Leu Pro Lys Ile GlyGlu Glu Phe Gly Gly Arg Asp His Thr Thr 405 410 415 Val Ile His Ala HisGlu Lys Ile Ser Lys Leu Leu Ala Asp Asp Glu 420 425 430 Gln Leu Gln GlnHis Val Lys Glu Ile Lys Glu Gln Leu Lys 435 440 445 99 507 PRTMycobacterium tuberculosis 99 Met Thr Asp Asp Pro Gly Ser Gly Phe ThrThr Val Trp Asn Ala Val 1 5 10 15 Val Ser Glu Leu Asn Gly Asp Pro LysVal Asp Asp Gly Pro Ser Ser 20 25 30 Asp Ala Asn Leu Ser Ala Pro Leu ThrPro Gln Gln Arg Ala Trp Leu 35 40 45 Asn Leu Val Gln Pro Leu Thr Ile ValGlu Gly Phe Ala Leu Leu Ser 50 55 60 Val Pro Ser Ser Phe Val Gln Asn GluIle Glu Arg His Leu Arg Ala 65 70 75 80 Pro Ile Thr Asp Ala Leu Ser ArgArg Leu Gly His Gln Ile Gln Leu 85 90 95 Gly Val Arg Ile Ala Pro Pro AlaThr Asp Glu Ala Asp Asp Thr Thr 100 105 110 Val Pro Pro Ser Glu Asn ProAla Thr Thr Ser Pro Asp Thr Thr Thr 115 120 125 Asp Asn Asp Glu Ile AspAsp Ser Ala Ala Ala Arg Gly Asp Asn Gln 130 135 140 His Ser Trp Pro SerTyr Phe Thr Glu Arg Pro His Asn Thr Asp Ser 145 150 155 160 Ala Thr AlaGly Val Thr Ser Leu Asn Arg Arg Tyr Thr Phe Asp Thr 165 170 175 Phe ValIle Gly Ala Ser Asn Arg Phe Ala His Ala Ala Ala Leu Ala 180 185 190 IleAla Glu Ala Pro Ala Arg Ala Tyr Asn Pro Leu Phe Ile Trp Gly 195 200 205Glu Ser Gly Leu Gly Lys Thr His Leu Leu His Ala Ala Gly Asn Tyr 210 215220 Ala Gln Arg Leu Phe Pro Gly Met Arg Val Lys Tyr Val Ser Thr Glu 225230 235 240 Glu Phe Thr Asn Asp Phe Ile Asn Ser Leu Arg Asp Asp Arg LysVal 245 250 255 Ala Phe Lys Arg Ser Tyr Arg Asp Val Asp Val Leu Leu ValAsp Asp 260 
265 270 Ile Gln Phe Ile Glu Gly Lys Glu Gly Ile Gln Glu GluPhe Phe His 275 280 285 Thr Phe Asn Thr Leu His Asn Ala Asn Lys Gln IleVal Ile Ser Ser 290 295 300 Asp Arg Pro Pro Lys Gln Leu Ala Thr Leu GluAsp Arg Leu Arg Thr 305 310 315 320 Arg Phe Glu Trp Gly Leu Ile Thr AspVal Gln Pro Pro Glu Leu Glu 325 330 335 Thr Arg Ile Ala Ile Leu Arg LysLys Ala Gln Met Glu Arg Leu Ala 340 345 350 Val Pro Asp Asp Val Leu GluLeu Ile Ala Ser Ser Ile Glu Arg Asn 355 360 365 Ile Arg Glu Leu Glu GlyAla Leu Ile Arg Val Thr Ala Phe Ala Ser 370 375 380 Leu Asn Lys Thr ProIle Asp Lys Ala Leu Ala Glu Ile Val Leu Arg 385 390 395 400 Asp Leu IleAla Asp Ala Asn Thr Met Gln Ile Ser Ala Ala Thr Ile 405 410 415 Met AlaAla Thr Ala Glu Tyr Phe Asp Thr Thr Val Glu Glu Leu Arg 420 425 430 GlyPro Gly Lys Thr Arg Ala Leu Ala Gln Ser Arg Gln Ile Ala Met 435 440 445Tyr Leu Cys Arg Glu Leu Thr Asp Leu Ser Leu Pro Lys Ile Gly Gln 450 455460 Ala Phe Gly Arg Asp His Thr Thr Val Met Tyr Ala Gln Arg Lys Ile 465470 475 480 Leu Ser Glu Met Ala Glu Arg Arg Glu Val Phe Asp His Val LysGlu 485 490 495 Leu Thr Thr Arg Ile Arg Gln Arg Ser Lys Arg 500 505 100446 PRT Thermus thermophilus 100 Met Ser His Glu Ala Val Trp Gln His ValLeu Glu His Ile Arg Arg 1 5 10 15 Ser Ile Thr Glu Val Glu Phe His ThrTrp Phe Glu Arg Ile Arg Pro 20 25 30 Leu Gly Ile Arg Asp Gly Val Leu GluLeu Ala Val Pro Thr Ser Phe 35 40 45 Ala Leu Asp Trp Ile Arg Arg His TyrAla Gly Leu Ile Gln Glu Gly 50 55 60 Pro Arg Leu Leu Gly Ala Gln Ala ProArg Phe Glu Leu Arg Val Val 65 70 75 80 Pro Gly Val Val Val Gln Glu AspIle Phe Gln Pro Pro Pro Ser Pro 85 90 95 Pro Ala Gln Ala Gln Pro Glu AspThr Phe Lys Thr Ser Trp Trp Gly 100 105 110 Pro Thr Thr Pro Trp Pro HisGly Gly Ala Val Ala Val Ala Glu Ser 115 120 125 Pro Gly Arg Ala Tyr AsnPro Leu Phe Ile Tyr Gly Gly Arg Gly Leu 130 135 140 Gly Lys Thr Tyr LeuMet His Ala Val Gly Pro Leu Arg Ala Lys Arg 145 150 155 160 Phe Pro HisMet Arg Leu Glu Tyr Val Ser Thr Glu Thr Phe Thr Asn 165 170 175 Glu LeuIle Asn Arg Pro Ser 
Ala Arg Asp Arg Met Thr Glu Phe Arg 180 185 190 GluArg Tyr Arg Ser Val Asp Leu Leu Leu Val Asp Asp Val Gln Phe 195 200 205Ile Ala Gly Lys Glu Arg Thr Gln Glu Glu Phe Phe His Thr Phe Asn 210 215220 Ala Leu Tyr Glu Ala His Lys Gln Ile Ile Leu Ser Ser Asp Arg Pro 225230 235 240 Pro Lys Asp Ile Leu Thr Leu Glu Ala Arg Leu Arg Ser Arg PheGlu 245 250 255 Trp Gly Leu Ile Thr Asp Asn Pro Ala Pro Asp Leu Glu ThrArg Ile 260 265 270 Ala Ile Leu Lys Met Asn Ala Ser Ser Gly Pro Glu AspPro Glu Asp 275 280 285 Ala Leu Glu Tyr Ile Ala Arg Gln Val Thr Ser AsnIle Arg Glu Trp 290 295 300 Glu Gly Ala Leu Met Arg Ala Ser Pro Phe AlaSer Leu Asn Gly Val 305 310 315 320 Glu Leu Thr Arg Ala Val Ala Ala LysAla Leu Arg His Leu Arg Pro 325 330 335 Arg Glu Leu Glu Ala Asp Pro LeuGlu Ile Ile Arg Lys Ala Ala Gly 340 345 350 Pro Val Arg Pro Glu Thr ProGly Gly Ala His Gly Glu Arg Arg Lys 355 360 365 Lys Glu Val Val Leu ProArg Gln Leu Ala Met Tyr Leu Val Arg Glu 370 375 380 Leu Thr Pro Ala SerLeu Pro Glu Ile Gly Gln Leu Phe Gly Gly Arg 385 390 395 400 Asp His ThrThr Val Arg Tyr Ala Ile Gln Lys Val Gln Glu Leu Ala 405 410 415 Gly LysPro Asp Arg Glu Val Gln Gly Leu Leu Arg Thr Leu Arg Glu 420 425 430 AlaCys Thr Asp Pro Val Asp Asn Leu Trp Ile Thr Cys Gly 435 440 445 101 467PRT Escherichia coli 101 Met Ser Leu Ser Leu Trp Gln Gln Cys Leu Ala ArgLeu Gln Asp Glu 1 5 10 15 Leu Pro Ala Thr Glu Phe Ser Met Trp Ile ArgPro Leu Gln Ala Glu 20 25 30 Leu Ser Asp Asn Thr Leu Ala Leu Tyr Ala ProAsn Arg Phe Val Leu 35 40 45 Asp Trp Val Arg Asp Lys Tyr Leu Asn Asn IleAsn Gly Leu Leu Thr 50 55 60 Ser Phe Cys Gly Ala Asp Ala Pro Gln Leu ArgPhe Glu Val Gly Thr 65 70 75 80 Lys Pro Val Thr Gln Thr Pro Gln Ala AlaVal Thr Ser Asn Val Ala 85 90 95 Ala Pro Ala Gln Val Ala Gln Thr Gln ProGln Arg Ala Ala Pro Ser 100 105 110 Thr Arg Ser Gly Trp Asp Asn Val ProAla Pro Ala Glu Pro Thr Tyr 115 120 125 Arg Ser Asn Val Asn Val Lys HisThr Phe Asp Asn Phe Val Glu Gly 130 135 140 Lys Ser Asn Gln Leu Ala ArgAla Ala Ala Arg Gln Val Ala 
Asp Asn 145 150 155 160 Pro Gly Gly Ala Tyr Asn Pro Leu Phe Leu Tyr Gly Gly Thr Gly Leu 165 170 175 Gly Lys Thr His Leu Leu His Ala Val Gly Asn Gly Ile Met Ala Arg 180 185 190 Lys Pro Asn Ala Lys Val Val Tyr Met His Ser Glu Arg Phe Val Gln 195 200 205 Asp Met Val Lys Ala Leu Gln Asn Asn Ala Ile Glu Glu Phe Lys Arg 210 215 220 Tyr Tyr Arg Ser Val Asp Ala Leu Leu Ile Asp Asp Ile Gln Phe Phe 225 230 235 240 Ala Asn Lys Glu Arg Ser Gln Glu Glu Phe Phe His Thr Phe Asn Ala 245 250 255 Leu Leu Glu Gly Asn Gln Gln Ile Ile Leu Thr Ser Asp Arg Tyr Pro 260 265 270 Lys Glu Ile Asn Gly Val Glu Asp Arg Leu Lys Ser Arg Phe Gly Trp 275 280 285 Gly Leu Thr Val Ala Ile Glu Pro Pro Glu Leu Glu Thr Arg Val Ala 290 295 300 Ile Leu Met Lys Lys Ala Asp Glu Asn Asp Ile Arg Leu Pro Gly Glu 305 310 315 320 Val Ala Phe Phe Ile Ala Lys Arg Leu Arg Ser Asn Val Arg Glu Leu 325 330 335 Glu Gly Ala Leu Asn Arg Val Ile Ala Asn Ala Asn Phe Thr Gly Arg 340 345 350 Ala Ile Thr Ile Asp Phe Val Arg Glu Ala Leu Arg Asp Leu Leu Ala 355 360 365 Leu Gln Glu Lys Leu Val Thr Ile Asp Asn Ile Gln Lys Thr Val Ala 370 375 380 Glu Tyr Tyr Lys Ile Lys Val Ala Asp Leu Leu Ser Lys Arg Arg Ser 385 390 395 400 Arg Ser Val Ala Arg Pro Arg Gln Met Ala Met Ala Leu Ala Lys Glu 405 410 415 Leu Thr Asn His Ser Leu Pro Glu Ile Gly Asp Ala Phe Gly Gly Arg 420 425 430 Asp His Thr Thr Val Leu His Ala Cys Arg Lys Ile Glu Gln Leu Arg 435 440 445 Glu Glu Ser His Asp Ile Lys Glu Asp Phe Ser Asn Leu Ile Arg Thr 450 455 460 Leu Ser Ser 465 102 440 PRT Thermotoga maritima 102 Met Lys Glu Arg Ile Leu Gln Glu Ile Lys Thr Arg Val Asn Arg Lys 1 5 10 15 Ser Trp Glu Leu Trp Phe Ser Ser Phe Asp Val Lys Ser Ile Glu Gly 20 25 30 Asn Lys Val Val Phe Ser Val Gly Asn Leu Phe Ile Lys Glu Trp Leu 35 40 45 Glu Lys Lys Tyr Tyr Ser Val Leu Ser Lys Ala Val Lys Val Val Leu 50 55 60 Gly Asn Asp Ala Thr Phe Glu Ile Thr Tyr Glu Ala Phe Glu Pro His 65 70 75 80 Ser Ser Tyr Ser Glu Pro Leu Val Lys Lys Arg Ala Val Leu Leu Thr 85 90 95 Pro Leu Asn Pro Asp Tyr Thr Phe Glu Asn Phe Val Val
Gly Pro Gly 100 105 110 Asn Ser PheAla Tyr His Ala Ala Leu Glu Val Ala Lys His Pro Gly 115 120 125 Arg TyrAsn Pro Leu Phe Ile Tyr Gly Gly Val Gly Leu Gly Lys Thr 130 135 140 HisLeu Leu Gln Ser Ile Gly Asn Tyr Val Val Gln Asn Glu Pro Asp 145 150 155160 Leu Arg Val Met Tyr Ile Thr Ser Glu Lys Phe Leu Asn Asp Leu Val 165170 175 Asp Ser Met Lys Glu Gly Lys Leu Asn Glu Phe Arg Glu Lys Tyr Arg180 185 190 Lys Lys Val Asp Ile Leu Leu Ile Asp Asp Val Gln Phe Leu IleGly 195 200 205 Lys Thr Gly Val Gln Thr Glu Leu Phe His Thr Phe Asn GluLeu His 210 215 220 Asp Ser Gly Lys Gln Ile Val Ile Cys Ser Asp Arg GluPro Gln Lys 225 230 235 240 Leu Ser Glu Phe Gln Asp Arg Leu Val Ser ArgPhe Gln Met Gly Leu 245 250 255 Val Ala Lys Leu Glu Pro Pro Asp Glu GluThr Arg Lys Ser Ile Ala 260 265 270 Arg Lys Met Leu Glu Ile Glu His GlyGlu Leu Pro Glu Glu Val Leu 275 280 285 Asn Phe Val Ala Glu Asn Val AspAsp Asn Leu Arg Arg Leu Arg Gly 290 295 300 Ala Ile Ile Lys Leu Leu ValTyr Lys Glu Thr Thr Gly Lys Glu Val 305 310 315 320 Asp Leu Lys Glu AlaIle Leu Leu Leu Lys Asp Phe Ile Lys Pro Asn 325 330 335 Arg Val Lys AlaMet Asp Pro Ile Asp Glu Leu Ile Glu Ile Val Ala 340 345 350 Lys Val ThrGly Val Pro Arg Glu Glu Ile Leu Ser Asn Ser Arg Asn 355 360 365 Val LysAla Leu Thr Ala Arg Arg Ile Gly Met Tyr Val Ala Lys Asn 370 375 380 TyrLeu Lys Ser Ser Leu Arg Thr Ile Ala Glu Lys Phe Asn Arg Ser 385 390 395400 His Pro Val Val Val Asp Ser Val Lys Lys Val Lys Asp Ser Leu Leu 405410 415 Lys Gly Asn Lys Gln Leu Lys Ala Leu Ile Asp Glu Val Ile Gly Glu420 425 430 Ile Ser Arg Arg Ala Leu Ser Gly 435 440 103 457 PRTHelicobacter pylori 103 Met Asp Thr Asn Asn Asn Ile Glu Lys Glu Ile LeuAla Leu Val Lys 1 5 10 15 Gln Asn Pro Lys Val Ser Leu Ile Glu Tyr GluAsn Tyr Phe Ser Gln 20 25 30 Leu Lys Tyr Asn Pro Asn Ala Ser Lys Ser AspIle Ala Phe Phe Tyr 35 40 45 Ala Pro Asn Gln Val Leu Cys Thr Thr Ile ThrAla Lys Tyr Gly Ala 50 55 60 Leu Leu Lys Glu Ile Leu Ser Gln Asn Lys ValGly Met His Leu Ala 65 70 75 80 His Ser Val Asp Val 
Arg Ile Glu Val AlaPro Lys Ile Gln Ile Asn 85 90 95 Ala Gln Ser Asn Ile Asn Tyr Lys Ala IleLys Thr Ser Val Lys Asp 100 105 110 Ser Tyr Thr Phe Glu Asn Phe Val ValGly Ser Cys Asn Asn Thr Val 115 120 125 Tyr Glu Ile Ala Lys Lys Val AlaGln Ser Asp Thr Pro Pro Tyr Asn 130 135 140 Pro Val Leu Phe Tyr Gly GlyThr Gly Leu Gly Lys Thr His Ile Leu 145 150 155 160 Asn Ala Ile Gly AsnHis Ala Leu Glu Lys His Lys Lys Val Val Leu 165 170 175 Val Thr Ser GluAsp Phe Leu Thr Asp Phe Leu Lys His Leu Asp Asn 180 185 190 Lys Thr MetAsp Ser Phe Lys Ala Lys Tyr Arg His Cys Asp Phe Phe 195 200 205 Leu LeuAsp Asp Ala Gln Phe Leu Gln Gly Lys Pro Lys Leu Glu Glu 210 215 220 GluPhe Phe His Thr Phe Asn Glu Leu His Ala Asn Ser Lys Gln Ile 225 230 235240 Val Leu Ile Ser Asp Arg Ser Pro Lys Asn Ile Ala Gly Leu Glu Asp 245250 255 Arg Leu Lys Ser Arg Phe Glu Trp Gly Ile Thr Ala Lys Val Met Pro260 265 270 Pro Asp Leu Glu Thr Lys Leu Ser Ile Val Lys Gln Lys Cys GlnLeu 275 280 285 Asn Gln Ile Thr Leu Pro Glu Glu Val Met Glu Tyr Ile AlaGln His 290 295 300 Ile Ser Asp Asn Ile Arg Gln Met Glu Gly Ala Ile IleLys Ile Ser 305 310 315 320 Val Asn Ala Asn Leu Met Asn Ala Ser Ile AspLeu Asn Leu Ala Lys 325 330 335 Thr Val Leu Glu Asp Leu Gln Lys Asp HisAla Glu Gly Ser Ser Leu 340 345 350 Glu Asn Ile Leu Leu Ala Val Ala GlnSer Leu Asn Leu Lys Ser Ser 355 360 365 Glu Ile Lys Val Ser Ser Arg GlnLys Asn Val Ala Leu Ala Arg Lys 370 375 380 Leu Val Val Tyr Phe Ala ArgLeu Tyr Thr Pro Asn Pro Thr Leu Ser 385 390 395 400 Leu Ala Gln Phe LeuAsp Leu Lys Asp His Ser Ser Ile Ser Lys Met 405 410 415 Tyr Ser Gly ValLys Lys Met Leu Glu Glu Glu Lys Ser Pro Phe Val 420 425 430 Leu Ser LeuArg Glu Glu Ile Lys Asn Arg Leu Asn Glu Leu Asn Asp 435 440 445 Lys LysThr Ala Phe Asn Ser Ser Glu 450 455 104 1305 DNA Thermus thermophilus104 gtgtcgcacg aggccgtctg gcaacacgtt ctggagcaca tccgccgcag catcaccgag 60gtggagttcc acacctggtt tgaaaggatc cgccccttgg ggatccggga cggggtgctg 120gagctcgccg tgcccacctc ctttgccctg gactggatcc ggcgccacta cgccggcctc 
180atccaggagg gccctcggct cctcggggcc caggcgcccc ggtttgagct ccgggtggtg 240cccggggtcg tagtccagga ggacatcttc cagcccccgc cgagcccccc ggcccaagct 300caacccgaag atacctttaa aacttcgtgg tggggcccaa caactccatg gccccacggc 360ggcgccgtgg ccgtggccga gtcccccggc cgggcctaca accccctctt catctacggg 420ggccgtggcc tgggaaagac ctacctgatg cacgccgtgg gcccactccg tgcgaagcgc 480ttcccccaca tgagattaga gtacgtttcc acggaaactt tcaccaacga gctcatcaac 540cggccatccg cgagggaccg gatgacggag ttccgggagc ggtaccgctc cgtggacctc 600ctgctggtgg acgacgtcca gttcatcgcc ggaaaggagc gcacccagga ggagtttttc 660cacaccttca acgcccttta cgaggcccac aagcagatca tcctctcctc cgaccggccg 720cccaaggaca tcctcaccct ggaggcgcgc ctgcggagcc gctttgagtg gggcctgatc 780accgacaatc cagcccccga cctggaaacc cggatcgcca tcctgaagat gaacgccagc 840agcgggcctg aggatcccga ggacgccctg gagtacatcg cccggcaggt cacctccaac 900atccgggagt gggaaggggc cctcatgcgg gcatcgcctt tcgcctccct caacggcgtt 960gagctgaccc gcgccgtggc ggccaaggct ctccgacatc ttcgccccag ggagctggag 1020gcggacccct tggagatcat ccgcaaagcg gcgggaccag ttcggcctga aaccccggga 1080ggagctcacg gggagcgccg caagaaggag gtggtcctcc cccggcagct cgccatgtac 1140ctggtgcggg agctcacccc ggcctccctg cccgagatcg accagctcaa cgacgaccgg 1200gaccacacca cggtcctcta cgccatccag aaggtccagg agctcgcgga aagcgaccgg 1260gaggtgcagg gcctcctccg caccctccgg gaggcgtgca catga 1305 105 434 PRTThermus thermophilus 105 Val Ser His Glu Ala Val Trp Gln His Val Leu GluHis Ile Arg Arg 1 5 10 15 Ser Ile Thr Glu Val Glu Phe His Thr Trp PheGlu Arg Ile Arg Pro 20 25 30 Leu Gly Ile Arg Asp Gly Val Leu Glu Leu AlaVal Pro Thr Ser Phe 35 40 45 Ala Leu Asp Trp Ile Arg Arg His Tyr Ala GlyLeu Ile Gln Glu Gly 50 55 60 Pro Arg Leu Leu Gly Ala Gln Ala Pro Arg PheGlu Leu Arg Val Val 65 70 75 80 Pro Gly Val Val Val Gln Glu Asp Ile PheGln Pro Pro Pro Ser Pro 85 90 95 Pro Ala Gln Ala Gln Pro Glu Asp Thr PheLys Thr Ser Trp Trp Gly 100 105 110 Pro Thr Thr Pro Trp Pro His Gly GlyAla Val Ala Val Ala Glu Ser 115 120 125 Pro Gly Arg Ala Tyr Asn Pro LeuPhe Ile Tyr Gly Gly Arg Gly Leu 130 135 
140 Gly Lys Thr Tyr Leu Met HisAla Val Gly Pro Leu Arg Ala Lys Arg 145 150 155 160 Phe Pro His Met ArgLeu Glu Tyr Val Ser Thr Glu Thr Phe Thr Asn 165 170 175 Glu Leu Ile AsnArg Pro Ser Ala Arg Asp Arg Met Thr Glu Phe Arg 180 185 190 Glu Arg TyrArg Ser Val Asp Leu Leu Leu Val Asp Asp Val Gln Phe 195 200 205 Ile AlaGly Lys Glu Arg Thr Gln Glu Glu Phe Phe His Thr Phe Asn 210 215 220 AlaLeu Tyr Glu Ala His Lys Gln Ile Ile Leu Ser Ser Asp Arg Pro 225 230 235240 Pro Lys Asp Ile Leu Thr Leu Glu Ala Arg Leu Arg Ser Arg Phe Glu 245250 255 Trp Gly Leu Ile Thr Asp Asn Pro Ala Pro Asp Leu Glu Thr Arg Ile260 265 270 Ala Ile Leu Lys Met Asn Ala Ser Ser Gly Pro Glu Asp Pro GluAsp 275 280 285 Ala Leu Glu Tyr Ile Ala Arg Gln Val Thr Ser Asn Ile ArgGlu Trp 290 295 300 Glu Gly Ala Leu Met Arg Ala Ser Pro Phe Ala Ser LeuAsn Gly Val 305 310 315 320 Glu Leu Thr Arg Ala Val Ala Ala Lys Ala LeuArg His Leu Arg Pro 325 330 335 Arg Glu Leu Glu Ala Asp Pro Leu Glu IleIle Arg Lys Ala Ala Gly 340 345 350 Pro Val Arg Pro Glu Thr Pro Gly GlyAla His Gly Glu Arg Arg Lys 355 360 365 Lys Glu Val Val Leu Pro Arg GlnLeu Ala Met Tyr Leu Val Arg Glu 370 375 380 Leu Thr Pro Ala Ser Leu ProGlu Ile Asp Gln Leu Asn Asp Asp Arg 385 390 395 400 Asp His Thr Thr ValLeu Tyr Ala Ile Gln Lys Val Gln Glu Leu Ala 405 410 415 Glu Ser Asp ArgGlu Val Gln Gly Leu Leu Arg Thr Leu Arg Glu Ala 420 425 430 Cys Thr 1061128 DNA Thermus thermophilus 106 atgaacataa cggttcccaa aaaactcctctcggaccagc tttccctcct ggagcgcatc 60 gtcccctcta gaagcgccaa ccccctctacacctacctgg ggctttacgc cgaggaaggg 120 gccttgatcc tcttcgggac caacggggaggtggacctcg aggtccgcct ccccgccgag 180 gcccaaagcc ttccccgggt gctcgtccccgcccagccct tcttccagct ggtgcggagc 240 cttcctgggg acctcgtggc cctcggcctcgcctcggagc cgggccaggg ggggcagctg 300 gagctctcct ccgggcgttt ccgcacccggctcagcctgg cccctgccga gggctacccc 360 gagcttctgg tgcccgaggg ggaggacaagggggccttcc ccctccggac gcggatgccc 420 tccggggagc tcgtcaaggc cttgacccacgtgcgctacg ccgcgagcaa cgaggagtac 480 cgggccatct tccgcggggt 
gcagctggagttctcccccc agggcttccg ggcggtggcc 540 tccgacgggt accgcctcgc cctctacgacctgcccctgc cccaagggtt ccaggccaag 600 gccgtggtcc ccgcccggag cgtggacgagatggtgcggg tcctgaaggg ggcggacggg 660 gccgaggccg tcctcgccct gggcgagggggtgttggccc tggccctcga gggcggaagc 720 ggggtccgga tggccctccg cctcatggaaggggagttcc ccgactacca gagggtcatc 780 ccccaggagt tcgccctcaa ggtccaggtggagggggagg ccctcaggga ggcggtgcgc 840 cgggtgagcg tcctctccga ccggcagaaccaccgggtgg acctcctttt ggaggaaggc 900 cggatcctcc tctccgccga gggggactacggcaaggggc aggaggaggt gcccgcccag 960 gtggaggggc cggacatggc cgtggcctacaacgcccgct acctcctcga ggccctcgcc 1020 cccgtggggg accgggccca cctgggcatctccgggccca cgagcccgag cctcatctgg 1080 ggggacgggg aggggtaccg ggcggtggtggtgcccctca gggtctag 1128 107 376 PRT Thermus thermophilus 107 Met AsnIle Thr Val Pro Lys Lys Leu Leu Ser Asp Gln Leu Ser Leu 1 5 10 15 LeuGlu Arg Ile Val Pro Ser Arg Ser Ala Asn Pro Leu Tyr Thr Tyr 20 25 30 LeuGly Leu Tyr Ala Glu Glu Gly Ala Leu Ile Leu Phe Gly Thr Asn 35 40 45 GlyGlu Val Asp Leu Glu Val Arg Leu Pro Ala Glu Ala Gln Ser Leu 50 55 60 ProArg Val Leu Val Pro Ala Gln Pro Phe Phe Gln Leu Val Arg Ser 65 70 75 80Leu Pro Gly Asp Leu Val Ala Leu Gly Leu Ala Ser Glu Pro Gly Gln 85 90 95Gly Gly Gln Leu Glu Leu Ser Ser Gly Arg Phe Arg Thr Arg Leu Ser 100 105110 Leu Ala Pro Ala Glu Gly Tyr Pro Glu Leu Leu Val Pro Glu Gly Glu 115120 125 Asp Lys Gly Ala Phe Pro Leu Arg Thr Arg Met Pro Ser Gly Glu Leu130 135 140 Val Lys Ala Leu Thr His Val Arg Tyr Ala Ala Ser Asn Glu GluTyr 145 150 155 160 Arg Ala Ile Phe Arg Gly Val Gln Leu Glu Phe Ser ProGln Gly Phe 165 170 175 Arg Ala Val Ala Ser Asp Gly Tyr Arg Leu Ala LeuTyr Asp Leu Pro 180 185 190 Leu Pro Gln Gly Phe Gln Ala Lys Ala Val ValPro Ala Arg Ser Val 195 200 205 Asp Glu Met Val Arg Val Leu Lys Gly AlaAsp Gly Ala Glu Ala Val 210 215 220 Leu Ala Leu Gly Glu Gly Val Leu AlaLeu Ala Leu Glu Gly Gly Ser 225 230 235 240 Gly Val Arg Met Ala Leu ArgLeu Met Glu Gly Glu Phe Pro Asp Tyr 245 250 255 Gln Arg Val Ile Pro GlnGlu Phe Ala Leu Lys 
Val Gln Val Glu Gly 260 265 270 Glu Ala Leu Arg GluAla Val Arg Arg Val Ser Val Leu Ser Asp Arg 275 280 285 Gln Asn His ArgVal Asp Leu Leu Leu Glu Glu Gly Arg Ile Leu Leu 290 295 300 Ser Ala GluGly Asp Tyr Gly Lys Gly Gln Glu Glu Val Pro Ala Gln 305 310 315 320 ValGlu Gly Pro Asp Met Ala Val Ala Tyr Asn Ala Arg Tyr Leu Leu 325 330 335Glu Ala Leu Ala Pro Val Gly Asp Arg Ala His Leu Gly Ile Ser Gly 340 345350 Pro Thr Ser Pro Ser Leu Ile Trp Gly Asp Gly Glu Gly Tyr Arg Ala 355360 365 Val Val Val Pro Leu Arg Val Glx 370 375 108 376 PRT Thermusthermophilus 108 Met Asn Ile Thr Val Pro Lys Lys Leu Leu Ser Asp Gln LeuSer Leu 1 5 10 15 Leu Glu Arg Ile Val Pro Ser Arg Ser Ala Asn Pro LeuTyr Thr Tyr 20 25 30 Leu Gly Leu Tyr Ala Glu Glu Gly Ala Leu Ile Leu PheGly Thr Asn 35 40 45 Gly Glu Val Asp Leu Glu Val Arg Leu Pro Ala Glu AlaGln Ser Leu 50 55 60 Pro Arg Val Leu Val Pro Ala Gln Pro Phe Phe Gln LeuVal Arg Ser 65 70 75 80 Leu Pro Gly Asp Leu Val Ala Leu Gly Leu Ala SerGlu Pro Gly Gln 85 90 95 Gly Gly Gln Leu Glu Leu Ser Ser Gly Arg Phe ArgThr Arg Leu Ser 100 105 110 Leu Ala Pro Ala Glu Gly Tyr Pro Glu Leu LeuVal Pro Glu Gly Glu 115 120 125 Asp Lys Gly Ala Phe Pro Leu Arg Thr ArgMet Pro Ser Gly Glu Leu 130 135 140 Val Lys Ala Leu Thr His Val Arg TyrAla Ala Ser Asn Glu Glu Tyr 145 150 155 160 Arg Ala Ile Phe Arg Gly ValGln Leu Glu Phe Ser Pro Gln Gly Phe 165 170 175 Arg Ala Val Ala Ser AspGly Tyr Arg Leu Ala Leu Tyr Asp Leu Pro 180 185 190 Leu Pro Gln Gly PheGln Ala Lys Ala Val Val Pro Ala Arg Ser Val 195 200 205 Asp Glu Met ValArg Val Leu Lys Gly Ala Asp Gly Ala Glu Ala Val 210 215 220 Leu Ala LeuGly Glu Gly Val Leu Ala Leu Ala Leu Glu Gly Gly Ser 225 230 235 240 GlyVal Arg Met Ala Leu Arg Leu Met Glu Gly Glu Phe Pro Asp Tyr 245 250 255Gln Arg Val Ile Pro Gln Glu Phe Ala Leu Lys Val Gln Val Glu Gly 260 265270 Glu Ala Leu Arg Glu Ala Val Arg Arg Val Ser Val Leu Ser Asp Arg 275280 285 Gln Asn His Arg Val Asp Leu Leu Leu Glu Glu Gly Arg Ile Leu Leu290 295 300 Ser Ala Glu Gly Asp 
Tyr Gly Lys Gly Gln Glu Glu Val Pro AlaGln 305 310 315 320 Val Glu Gly Pro Asp Met Ala Val Ala Tyr Asn Ala ArgTyr Leu Leu 325 330 335 Glu Ala Leu Ala Pro Val Gly Asp Arg Ala His LeuGly Ile Ser Gly 340 345 350 Pro Thr Ser Pro Ser Leu Ile Trp Gly Asp GlyGlu Gly Tyr Arg Ala 355 360 365 Val Val Val Pro Leu Arg Val Glx 370 375109 367 PRT Escherichia coli 109 Met Lys Phe Thr Val Glu Arg Glu His LeuLeu Lys Pro Leu Gln Gln 1 5 10 15 Val Ser Gly Pro Leu Gly Gly Arg ProThr Leu Pro Ile Leu Gly Asn 20 25 30 Leu Leu Leu Gln Val Ala Asp Gly ThrLeu Ser Leu Thr Gly Thr Asp 35 40 45 Leu Glu Met Glu Met Val Ala Arg ValAla Leu Val Gln Pro His Glu 50 55 60 Pro Gly Ala Thr Thr Val Pro Ala ArgLys Phe Phe Asp Ile Cys Arg 65 70 75 80 Gly Leu Pro Glu Gly Ala Glu IleAla Val Gln Leu Glu Gly Glu Arg 85 90 95 Met Leu Val Arg Ser Gly Arg SerArg Phe Ser Leu Ser Thr Leu Pro 100 105 110 Ala Ala Asp Phe Pro Asn LeuAsp Asp Trp Gln Ser Glu Val Glu Phe 115 120 125 Thr Leu Pro Gln Ala ThrMet Lys Arg Leu Ile Glu Ala Thr Gln Phe 130 135 140 Ser Met Ala His GlnAsp Val Arg Tyr Tyr Leu Asn Gly Met Leu Phe 145 150 155 160 Glu Thr GluGly Glu Glu Leu Arg Thr Val Ala Thr Asp Gly His Arg 165 170 175 Leu AlaVal Cys Ser Met Pro Ile Gly Gln Ser Leu Pro Ser His Ser 180 185 190 ValIle Val Pro Arg Lys Gly Val Ile Glu Leu Met Arg Met Leu Asp 195 200 205Gly Gly Asp Asn Pro Leu Arg Val Gln Ile Gly Ser Asn Asn Ile Arg 210 215220 Ala His Val Gly Asp Phe Ile Phe Thr Ser Lys Leu Val Asp Gly Arg 225230 235 240 Phe Pro Asp Tyr Arg Arg Val Leu Pro Lys Asn Pro Asp Lys HisLeu 245 250 255 Glu Ala Gly Cys Asp Leu Leu Lys Gln Ala Phe Ala Arg AlaAla Ile 260 265 270 Leu Ser Asn Glu Lys Phe Arg Gly Val Arg Leu Tyr ValSer Glu Asn 275 280 285 Gln Leu Lys Ile Thr Ala Asn Asn Pro Glu Gln GluGlu Ala Glu Glu 290 295 300 Ile Leu Asp Val Thr Tyr Ser Gly Ala Glu MetGlu Ile Gly Phe Asn 305 310 315 320 Val Ser Tyr Val Leu Asp Val Leu AsnAla Leu Lys Cys Glu Asn Val 325 330 335 Arg Met Met Leu Thr Asp Ser ValSer Ser Val Gln Ile Glu Asp Ala 340 345 
350 Ala Ser Gln Ser Ala Ala TyrVal Val Met Pro Met Arg Leu Glx 355 360 365 110 367 PRT Proteusmirabilis 110 Met Lys Phe Ile Ile Glu Arg Glu Gln Leu Leu Lys Pro LeuGln Gln 1 5 10 15 Val Ser Gly Pro Leu Gly Gly Arg Pro Thr Leu Pro IleLeu Gly Asn 20 25 30 Leu Leu Leu Lys Val Thr Glu Asn Thr Leu Ser Leu ThrGly Thr Asp 35 40 45 Leu Glu Met Glu Met Met Ala Arg Val Ser Leu Ser GlnSer His Glu 50 55 60 Ile Gly Ala Thr Thr Val Pro Ala Arg Lys Phe Phe AspIle Trp Arg 65 70 75 80 Gly Leu Pro Glu Gly Ala Glu Ile Ser Val Glu LeuAsp Gly Asp Arg 85 90 95 Leu Leu Val Arg Ser Gly Arg Ser Arg Phe Ser LeuSer Thr Leu Pro 100 105 110 Ala Ser Asp Phe Pro Asn Leu Asp Asp Trp GlnSer Glu Val Glu Phe 115 120 125 Thr Leu Pro Gln Ala Thr Leu Lys Arg LeuIle Glu Ser Thr Gln Phe 130 135 140 Ser Met Ala His Gln Asp Val Arg TyrTyr Leu Asn Gly Met Leu Phe 145 150 155 160 Glu Thr Glu Asn Thr Glu LeuArg Thr Val Ala Thr Asp Gly His Arg 165 170 175 Leu Ala Val Cys Ala MetAsp Ile Gly Gln Ser Leu Pro Gly His Ser 180 185 190 Val Ile Val Pro ArgLys Gly Val Ile Glu Leu Met Arg Leu Leu Asp 195 200 205 Gly Ser Gly GluSer Leu Leu Gln Leu Gln Ile Gly Ser Asn Asn Leu 210 215 220 Arg Ala HisVal Gly Asp Phe Ile Phe Thr Ser Lys Leu Val Asp Gly 225 230 235 240 ArgPhe Pro Asp Tyr Arg Arg Val Leu Pro Lys Asn Pro Thr Lys Thr 245 250 255Val Ile Ala Gly Cys Asp Ile Leu Lys Gln Ala Phe Ser Arg Ala Ala 260 265270 Ile Leu Ser Asn Glu Lys Phe Arg Gly Val Arg Ile Asn Leu Thr Asn 275280 285 Gly Gln Leu Lys Ile Thr Ala Asn Asn Pro Glu Gln Glu Glu Ala Glu290 295 300 Glu Ile Val Asp Val Gln Tyr Gln Gly Glu Glu Met Glu Ile GlyPhe 305 310 315 320 Asn Val Ser Tyr Leu Leu Asp Val Leu Asn Thr Leu LysCys Glu Glu 325 330 335 Val Lys Leu Leu Leu Thr Asp Ala Val Ser Ser ValGln Val Glu Asn 340 345 350 Val Ala Ser Ala Ala Ala Ala Tyr Val Val MetPro Met Arg Leu 355 360 365 111 366 PRT Haemophilus influenzae 111 MetGln Phe Ser Ile Ser Arg Glu Asn Leu Leu Lys Pro Leu Gln Gln 1 5 10 15Val Cys Gly Val Leu Ser Asn Arg Pro Asn Ile Pro Val Leu Asn 
Asn 20 25 30Val Leu Leu Gln Ile Glu Asp Tyr Arg Leu Thr Ile Thr Gly Thr Asp 35 40 45Leu Glu Val Glu Leu Ser Ser Gln Thr Gln Leu Ser Ser Ser Ser Glu 50 55 60Asn Gly Thr Phe Thr Ile Pro Ala Lys Lys Phe Leu Asp Ile Cys Arg 65 70 7580 Thr Leu Ser Asp Asp Ser Glu Ile Thr Val Thr Phe Glu Gln Asp Arg 85 9095 Ala Leu Val Gln Ser Gly Arg Ser Arg Phe Thr Leu Ala Thr Gln Pro 100105 110 Ala Glu Glu Tyr Pro Asn Leu Thr Asp Trp Gln Ser Glu Val Asp Phe115 120 125 Glu Leu Pro Gln Asn Thr Leu Arg Arg Leu Ile Glu Ala Thr GlnPhe 130 135 140 Ser Met Ala Asn Gln Asp Ala Arg Tyr Phe Leu Asn Gly MetLys Phe 145 150 155 160 Glu Thr Glu Gly Asn Leu Leu Arg Thr Val Ala ThrAsp Gly His Arg 165 170 175 Leu Ala Val Cys Thr Ile Ser Leu Glu Gln GluLeu Gln Asn His Ser 180 185 190 Val Ile Leu Pro Arg Lys Gly Val Leu GluLeu Val Arg Leu Leu Glu 195 200 205 Thr Asn Asp Glu Pro Ala Arg Leu GlnIle Gly Thr Asn Asn Leu Arg 210 215 220 Val His Leu Lys Asn Thr Val PheThr Ser Lys Leu Ile Asp Gly Arg 225 230 235 240 Phe Pro Asp Tyr Arg ArgVal Leu Pro Arg Asn Ala Thr Lys Ile Val 245 250 255 Glu Gly Asn Trp GluMet Leu Lys Gln Ala Phe Ala Arg Ala Ser Ile 260 265 270 Leu Ser Asn GluArg Ala Arg Ser Val Arg Leu Ser Leu Lys Glu Asn 275 280 285 Gln Leu LysIle Thr Ala Ser Asn Thr Glu His Glu Glu Ala Glu Glu 290 295 300 Ile ValAsp Val Asn Tyr Asn Gly Glu Glu Leu Glu Val Gly Phe Asn 305 310 315 320Val Thr Tyr Ile Leu Asp Val Leu Asn Ala Leu Lys Cys Asn Gln Val 325 330335 Arg Met Cys Leu Thr Asp Ala Phe Ser Ser Cys Leu Ile Glu Asn Cys 340345 350 Glu Asp Ser Ser Cys Glu Tyr Val Ile Met Pro Met Arg Leu 355 360365 112 367 PRT Pseudomonas putida 112 Met His Phe Thr Ile Gln Arg GluAla Leu Leu Lys Pro Leu Gln Leu 1 5 10 15 Val Ala Gly Val Val Glu ArgArg Gln Thr Leu Pro Val Leu Ser Asn 20 25 30 Val Leu Leu Val Val Gln GlyGln Gln Leu Ser Leu Thr Gly Thr Asp 35 40 45 Leu Glu Val Glu Leu Val GlyArg Val Gln Leu Glu Glu Pro Ala Glu 50 55 60 Pro Gly Glu Ile Thr Val ProAla Arg Lys Leu Met Asp Ile Cys Lys 65 70 75 80 Ser Leu Pro Asn Asp 
AlaLeu Ile Asp Ile Lys Val Asp Glu Gln Lys 85 90 95 Leu Leu Val Lys Ala GlyArg Ser Arg Phe Thr Leu Ser Thr Leu Pro 100 105 110 Ala Asn Asp Phe ProThr Val Glu Glu Gly Pro Gly Ser Leu Thr Cys 115 120 125 Asn Leu Glu GlnSer Lys Leu Arg Arg Leu Ile Glu Arg Thr Ser Phe 130 135 140 Ala Met AlaGln Gln Asp Val Arg Tyr Tyr Leu Asn Gly Met Leu Leu 145 150 155 160 GluVal Ser Arg Asn Thr Leu Arg Ala Val Ser Thr Asp Gly His Arg 165 170 175Leu Ala Leu Cys Ser Met Ser Ala Pro Ile Glu Gln Glu Asp Arg His 180 185190 Gln Val Ile Val Pro Arg Lys Gly Ile Leu Glu Leu Ala Arg Leu Leu 195200 205 Thr Asp Pro Glu Gly Met Val Ser Ile Val Leu Gly Gln His His Ile210 215 220 Arg Ala Thr Thr Gly Glu Phe Thr Phe Thr Ser Lys Leu Val AspGly 225 230 235 240 Lys Phe Pro Asp Tyr Glu Arg Val Leu Pro Lys Gly GlyAsp Lys Leu 245 250 255 Val Val Gly Asp Arg Gln Ala Leu Arg Glu Ala PheSer Arg Thr Ala 260 265 270 Ile Leu Ser Asn Glu Lys Tyr Arg Gly Ile ArgLeu Gln Leu Ala Ala 275 280 285 Gly Gln Leu Lys Ile Gln Ala Asn Asn ProGlu Gln Glu Glu Ala Glu 290 295 300 Glu Glu Ile Ser Val Asp Tyr Glu GlySer Ser Leu Glu Ile Gly Phe 305 310 315 320 Asn Val Ser Tyr Leu Leu AspVal Leu Gly Val Met Thr Thr Glu Gln 325 330 335 Val Arg Leu Ile Leu SerAsp Ser Asn Ser Ser Ala Leu Leu Gln Glu 340 345 350 Ala Gly Asn Asp AspSer Ser Tyr Val Val Met Pro Met Arg Leu 355 360 365 113 366 PRT Buchneraaphidicola 113 Met Lys Phe Thr Ile Gln Asn Asp Ile Leu Thr Lys Asn LeuLys Lys 1 5 10 15 Ile Thr Arg Val Leu Val Lys Asn Ile Ser Phe Pro IleLeu Glu Asn 20 25 30 Ile Leu Ile Gln Val Glu Asp Gly Thr Leu Ser Leu ThrThr Thr Asn 35 40 45 Leu Glu Ile Glu Leu Ile Ser Lys Ile Glu Ile Ile ThrLys Tyr Ile 50 55 60 Pro Gly Lys Thr Thr Ile Ser Gly Arg Lys Ile Leu AsnIle Cys Arg 65 70 75 80 Thr Leu Ser Glu Lys Ser Lys Ile Lys Met Gln LeuLys Asn Lys Lys 85 90 95 Met Tyr Ile Ser Ser Glu Asn Ser Asn Tyr Ile LeuSer Thr Leu Ser 100 105 110 Ala Asp Thr Phe Pro Asn His Gln Asn Phe AspTyr Ile Ser Lys Phe 115 120 125 Asp Ile Ser Ser Asn Ile Leu Lys Glu MetIle 
Glu Lys Thr Glu Phe 130 135 140 Ser Met Gly Lys Gln Asp Val Arg TyrTyr Leu Asn Gly Met Leu Leu 145 150 155 160 Glu Lys Lys Asp Lys Phe LeuArg Ser Val Ala Thr Asp Gly Tyr Arg 165 170 175 Leu Ala Ile Ser Tyr ThrGln Leu Lys Lys Asp Ile Asn Phe Phe Ser 180 185 190 Ile Ile Ile Pro AsnLys Ala Val Met Glu Leu Leu Lys Leu Leu Asn 195 200 205 Thr Gln Pro GlnLeu Leu Asn Ile Leu Ile Gly Ser Asn Ser Ile Arg 210 215 220 Ile Tyr ThrLys Asn Leu Ile Phe Thr Thr Gln Leu Ile Glu Gly Glu 225 230 235 240 TyrPro Asp Tyr Lys Ser Val Leu Phe Lys Glu Lys Lys Asn Pro Ile 245 250 255Ile Thr Asn Ser Ile Leu Leu Lys Lys Ser Leu Leu Arg Val Ala Ile 260 265270 Leu Ala His Glu Lys Phe Cys Gly Ile Glu Ile Lys Ile Glu Asn Gly 275280 285 Lys Phe Lys Val Leu Ser Asp Asn Gln Glu Glu Glu Thr Ala Glu Asp290 295 300 Leu Phe Glu Ile Asp Tyr Phe Gly Glu Lys Ile Glu Ile Ser IleAsn 305 310 315 320 Val Tyr Tyr Leu Leu Asp Val Ile Asn Asn Ile Lys SerGlu Asn Ile 325 330 335 Ala Leu Phe Leu Asn Lys Ser Lys Ser Ser Ile GlnIle Glu Ala Glu 340 345 350 Asn Asn Ser Ser Asn Ala Tyr Val Val Met LeuLeu Lys Arg 355 360 365 114 39 DNA Artificial Sequence Description ofArtificial Sequence primer 114 gtgtggatcc tcgtccccct catgcgcgaccaggaaggg 39 115 27 DNA Artificial Sequence Description of ArtificialSequence primer 115 gtgtggatcc gtggtgacct tagccac 27 116 30 DNAArtificial Sequence Description of Artificial Sequence primer 116ttcgtgtccg aggaccttgt ggtccacaac 30 117 3514 DNA Aquifex aeolicus 117atgagtaagg atttcgtcca ccttcacctg cacacccagt tctcactcct ggacggggct 60ataaagatag acgagctcgt gaaaaaggca aaggagtatg gatacaaagc tgtcggaatg 120tcagaccacg gaaacctctt cggttcgtat aaattctaca aagccctgaa ggcggaagga 180attaagccca taatcggcat ggaagcctac tttaccacgg gttcgaggtt tgacagaaag 240actaaaacga gcgaggacaa cataaccgac aagtacaacc accacctcat acttatagca 300aaggacgaaa aggtctaaag aacttaatga agctctcaac cctcgcctac aaagaaggtt 360tttactacaa acccagaatt gattacgaac tccttgaaaa gtacggggag ggcctaatag 420cccttaccgc atgcctgaaa ggtgttccca cctactacgc ttctataaac 
gaagtgaaaa 480aggcggagga atgggtaaag aagttcaagg atatattcgg agatgacctt tatttagaac 540ttcaagcgaa caacattcca gaacaggaag tggcaaacag gaacttaata gagatagcca 600aaaagtacga tgtgaaactc atagcgacgc aggacgccca ctacctcaat cccgaagaca 660ggtacgccca cacggttctt atggcacttc aaatgaaaaa gaccattcac gaactgagtt 720cgggaaactt caagtgttca aacgaagacc ttcactttgc tccacccgag tacatgtgga 780aaaagtttga aggtaagttc gaaggctggg aaaaggcact cctgaacact ctcgaggtaa 840tggaaaagac agcggacagc tttgagatat ttgaaaactc cacctacctc cttcccaagt 900acgacgttcc gcccgacaaa acccttgagg aatacctcag agaactcgcg tacaaaggtt 960taagacagag gatagaaagg ggacaagcta aggatactaa agagtactgg gagaggctcg 1020agtacgaact ggaagttata aacaaaatgg gctttgcggg atacttcttg atagttcagg 1080acttcataaa ctgggctaag aaaaacgaca tacctgttgg acccggaagg ggaagtgctg 1140gaggttccct cgtcgcatac gccatcggaa taacggacgt tgaccctata aagcacggat 1200tcctttttga gaggttctta aaccccgaaa gggtttccat gccggatata gacgtggatt 1260tctgtcagga caacagggaa aaggtcatag agtacgtaag gaacaagtac ggacacgaca 1320acgtagctca gataatcacc tacaacgtaa tgaaggcgaa gcaaacactg agagacgtcg 1380caagggccat gggactcccc tactccaccg cggacaaact cgcaaaactc attcctcagg 1440gggacgttca gggaacgtgg ctcagtctgg aagagatgta caaaacgcct gtggaggaac 1500tccttcagaa gtacggagaa cacagaacgg acatagagga caacgtaaag aagttcagac 1560agatatgcga agaaagtccg gagataaaac agctcgttga gacggccctg aagcttgaag 1620gtctcacgag acacacctcc ctccacgccg cgggagtggt tatagcacca aagcccttga 1680gcgagctcgt tcccctctac tacgataaag agggcgaagt cgcaacccag tacgacatgg 1740ttcagctcga agaactcggt ctcctgaaga tggacttcct cggactcaaa accctcacag 1800aactgaaact catgaaagaa ctcataaagg aaagacacgg agtggatata aacttccttg 1860aacttcccct tgacgacccg aaagtttaca aactccttca ggaaggaaaa accacgggag 1920tgttccagct cgaaagcagg ggaatgaaag aactcctgaa gaaactaaag cccgacagct 1980ttgacgacat cgttgcggtc ctcgcactct acagacccgg acctctaaag agcggactcg 2040ttgacacata cattaagaga aagcacggaa aagaacccgt tgagtacccc ttcccggagc 2100ttgaacccgt ccttaaggaa acctacggag taatcgttta tcaggaacag gtgatgaaga 2160tgtctcagat actttccggc tttactcccg 
gagaggcgga taccctcaga aaggcgatag 2220gtaagaagaa agcggattta atggctcaga tgaaagacaa gttcatacag ggagcggtgg 2280aaaggggata ccctgaagaa aagataagga agctctggga agacatagag aagttcgctt 2340cctactcctt caacaagtct cactcggtag cttacgggta catctcctac tggaccgcct 2400acgttaaagc ccactatccc gcggagttct tcgcggtaaa actcacaact gaaaagaacg 2460acaacaagtt cctcaacctc ataaaagacg ctaaactctt cggatttgag atacttcccc 2520ccgacataaa caagagtgat gtaggattta cgatagaagg tgaaaacagg ataaggttcg 2580ggcttgcgag gataaaggga gtgggagagg aaactgctaa gataatcgtt gaagctagaa 2640agaagtataa gcagttcaaa gggcttgcgg acttcataaa caaaaccaag aacaggaaga 2700taaacaagaa agtcgtggaa gcactcgtaa aggcaggggc ttttgacttt actaagaaaa 2760agaggaaaga actactcgct aaagtggcaa actctgaaaa agcattaatg gctacacaaa 2820actccctttt cggtgcaccg aaagaagaag tggaagaact cgacccctta aagcttgaaa 2880aggaagttct cggtttttac atttcagggc acccccttga caactacgaa aagctcctca 2940agaaccgcta cacacccatt gaagatttag aagagtggga caaggaaagc gaagcggtgc 3000ttacaggagt tatcacggaa ctcaaagtaa aaaagacgaa aaacggagat tacatggcgg 3060tcttcaacct cgttgacaag acgggactaa tagagtgtgt cgtcttcccg ggagtttacg 3120aagaggcaaa ggaactgata gaagaggaca gagtagtggt agtcaaaggt tttctggacg 3180aggaccttga aacggaaaat gtcaagttcg tggtgaaaga ggttttctcc cctgaggagt 3240tcgcaaagga gatgaggaat accctttata tattcttaaa aagagagcaa gccctaaacg 3300gcgttgccga aaaactaaag ggaattattg aaaacaacag gacggaggac ggatacaact 3360tggttctcac ggttgatctg ggagactact tcgttgattt agcactccca caagatatga 3420aactaaaggc tgacagaaag gttgtagagg agatagaaaa actgggagtg aaggtcataa 3480tttagtaaat aacccttact tccgagtagt cccc 3514 118 1161 PRT Aquifex aeolicus118 Met Ser Lys Asp Phe Val His Leu His Leu His Thr Gln Phe Ser Leu 1 510 15 Leu Asp Gly Ala Ile Lys Ile Asp Glu Leu Val Lys Lys Ala Lys Glu 2025 30 Tyr Gly Tyr Lys Ala Val Gly Met Ser Asp His Gly Asn Leu Phe Gly 3540 45 Ser Tyr Lys Phe Tyr Lys Ala Leu Lys Ala Glu Gly Ile Lys Pro Ile 5055 60 Ile Gly Met Glu Ala Tyr Phe Thr Thr Gly Ser Arg Phe Asp Arg Lys 6570 75 80 Thr Lys Thr Ser Glu Asp Asn Ile Thr Asp Lys Tyr Asn 
His His Leu85 90 95 Ile Leu Ile Ala Lys Asp Asp Lys Gly Leu Lys Asn Leu Met Lys Leu100 105 110 Ser Thr Leu Ala Tyr Lys Glu Gly Phe Tyr Tyr Lys Pro Arg IleAsp 115 120 125 Tyr Glu Leu Leu Glu Lys Tyr Gly Glu Gly Leu Ile Ala LeuThr Ala 130 135 140 Cys Leu Lys Gly Val Pro Thr Tyr Tyr Ala Ser Ile AsnGlu Val Lys 145 150 155 160 Lys Ala Glu Glu Trp Val Lys Lys Phe Lys AspIle Phe Gly Asp Asp 165 170 175 Leu Tyr Leu Glu Leu Gln Ala Asn Asn IlePro Glu Gln Glu Val Ala 180 185 190 Asn Arg Asn Leu Ile Glu Ile Ala LysLys Tyr Asp Val Lys Leu Ile 195 200 205 Ala Thr Gln Asp Ala His Tyr LeuAsn Pro Glu Asp Arg Tyr Ala His 210 215 220 Thr Val Leu Met Ala Leu GlnMet Lys Lys Thr Ile His Glu Leu Ser 225 230 235 240 Ser Gly Asn Phe LysCys Ser Asn Glu Asp Leu His Phe Ala Pro Pro 245 250 255 Glu Tyr Met TrpLys Lys Phe Glu Gly Lys Phe Glu Gly Trp Glu Lys 260 265 270 Ala Leu LeuAsn Thr Leu Glu Val Met Glu Lys Thr Ala Asp Ser Phe 275 280 285 Glu IlePhe Glu Asn Ser Thr Tyr Leu Leu Pro Lys Tyr Asp Val Pro 290 295 300 ProAsp Lys Thr Leu Glu Glu Tyr Leu Arg Glu Leu Ala Tyr Lys Gly 305 310 315320 Leu Arg Gln Arg Ile Glu Arg Gly Gln Ala Lys Asp Thr Lys Glu Tyr 325330 335 Trp Glu Arg Leu Glu Tyr Glu Leu Glu Val Ile Asn Lys Met Gly Phe340 345 350 Ala Gly Tyr Phe Leu Ile Val Gln Asp Phe Ile Asn Trp Ala LysLys 355 360 365 Asn Asp Ile Pro Val Gly Pro Gly Arg Gly Ser Ala Gly GlySer Leu 370 375 380 Val Ala Tyr Ala Ile Gly Ile Thr Asp Val Asp Pro IleLys His Gly 385 390 395 400 Phe Leu Phe Glu Arg Phe Leu Asn Pro Glu ArgVal Ser Met Pro Asp 405 410 415 Ile Asp Val Asp Phe Cys Gln Asp Asn ArgGlu Lys Val Ile Glu Tyr 420 425 430 Val Arg Asn Lys Tyr Gly His Asp AsnVal Ala Gln Ile Ile Thr Tyr 435 440 445 Asn Val Met Lys Ala Lys Gln ThrLeu Arg Asp Val Ala Arg Ala Met 450 455 460 Gly Leu Pro Tyr Ser Thr AlaAsp Lys Leu Ala Lys Leu Ile Pro Gln 465 470 475 480 Gly Asp Val Gln GlyThr Trp Leu Ser Leu Glu Glu Met Tyr Lys Thr 485 490 495 Pro Val Glu GluLeu Leu Gln Lys Tyr Gly Glu His Arg Thr Asp Ile 500 505 510 Glu Asp 
AsnVal Lys Lys Phe Arg Gln Ile Cys Glu Glu Ser Pro Glu 515 520 525 Ile LysGln Leu Val Glu Thr Ala Leu Lys Leu Glu Gly Leu Thr Arg 530 535 540 HisThr Ser Leu His Ala Ala Gly Val Val Ile Ala Pro Lys Pro Leu 545 550 555560 Ser Glu Leu Val Pro Leu Tyr Tyr Asp Lys Glu Gly Glu Val Ala Thr 565570 575 Gln Tyr Asp Met Val Gln Leu Glu Glu Leu Gly Leu Leu Lys Met Asp580 585 590 Phe Leu Gly Leu Lys Thr Leu Thr Glu Leu Lys Leu Met Lys GluLeu 595 600 605 Ile Lys Glu Arg His Gly Val Asp Ile Asn Phe Leu Glu LeuPro Leu 610 615 620 Asp Asp Pro Lys Val Tyr Lys Leu Leu Gln Glu Gly LysThr Thr Gly 625 630 635 640 Val Phe Gln Leu Glu Ser Arg Gly Met Lys GluLeu Leu Lys Lys Leu 645 650 655 Lys Pro Asp Ser Phe Asp Asp Ile Val AlaVal Leu Ala Leu Tyr Arg 660 665 670 Pro Gly Pro Leu Lys Ser Gly Leu ValAsp Thr Tyr Ile Lys Arg Lys 675 680 685 His Gly Lys Glu Pro Val Glu TyrPro Phe Pro Glu Leu Glu Pro Val 690 695 700 Leu Lys Glu Thr Tyr Gly ValIle Val Tyr Gln Glu Gln Val Met Lys 705 710 715 720 Met Ser Gln Ile LeuSer Gly Phe Thr Pro Gly Glu Ala Asp Thr Leu 725 730 735 Arg Lys Ala IleGly Lys Lys Lys Ala Asp Leu Met Ala Gln Met Lys 740 745 750 Asp Lys PheIle Gln Gly Ala Val Glu Arg Gly Tyr Pro Glu Glu Lys 755 760 765 Ile ArgLys Leu Trp Glu Asp Ile Glu Lys Phe Ala Ser Tyr Ser Phe 770 775 780 AsnLys Ser His Ser Val Ala Tyr Gly Tyr Ile Ser Tyr Trp Thr Ala 785 790 795800 Tyr Val Lys Ala His Tyr Pro Ala Glu Phe Phe Ala Val Lys Leu Thr 805810 815 Thr Glu Lys Asn Asp Asn Lys Phe Leu Asn Leu Ile Lys Asp Ala Lys820 825 830 Leu Phe Gly Phe Glu Ile Leu Pro Pro Asp Ile Asn Lys Ser AspVal 835 840 845 Gly Phe Thr Ile Glu Gly Glu Asn Arg Ile Arg Phe Gly LeuAla Arg 850 855 860 Ile Lys Gly Val Gly Glu Glu Thr Ala Lys Ile Ile ValGlu Ala Arg 865 870 875 880 Lys Lys Tyr Lys Gln Phe Lys Gly Leu Ala AspPhe Ile Asn Lys Thr 885 890 895 Lys Asn Arg Lys Ile Asn Lys Lys Val ValGlu Ala Leu Val Lys Ala 900 905 910 Gly Ala Phe Asp Phe Thr Lys Lys LysArg Lys Glu Leu Leu Ala Lys 915 920 925 Val Ala Asn Ser Glu Lys Ala LeuMet Ala 
Thr Gln Asn Ser Leu Phe 930 935 940 Gly Ala Pro Lys Glu Glu ValGlu Glu Leu Asp Pro Leu Lys Leu Glu 945 950 955 960 Lys Glu Val Leu GlyPhe Tyr Ile Ser Gly His Pro Leu Asp Asn Tyr 965 970 975 Glu Lys Leu LeuLys Asn Arg Tyr Thr Pro Ile Glu Asp Leu Glu Glu 980 985 990 Trp Asp LysGlu Ser Glu Ala Val Leu Thr Gly Val Ile Thr Glu Leu 995 1000 1005 LysVal Lys Lys Thr Lys Asn Gly Asp Tyr Met Ala Val Phe Asn Leu 1010 10151020 Val Asp Lys Thr Gly Leu Ile Glu Cys Val Val Phe Pro Gly Val Tyr1025 1030 1035 1040 Glu Glu Ala Lys Glu Leu Ile Glu Glu Asp Arg Val ValVal Val Lys 1045 1050 1055 Gly Phe Leu Asp Glu Asp Leu Glu Thr Glu AsnVal Lys Phe Val Val 1060 1065 1070 Lys Glu Val Phe Ser Pro Glu Glu PheAla Lys Glu Met Arg Asn Thr 1075 1080 1085 Leu Tyr Ile Phe Leu Lys ArgGlu Gln Ala Leu Asn Gly Val Ala Glu 1090 1095 1100 Lys Leu Lys Gly IleIle Glu Asn Asn Arg Thr Glu Asp Gly Tyr Asn 1105 1110 1115 1120 Leu ValLeu Thr Val Asp Leu Gly Asp Tyr Phe Val Asp Leu Ala Leu 1125 1130 1135Pro Gln Asp Met Lys Leu Lys Ala Asp Arg Lys Val Val Glu Glu Ile 11401145 1150 Glu Lys Leu Gly Val Lys Val Ile Ile 1155 1160 119 2408 DNAAquifex aeolicus 119 atgaactacg ttcccttcgc gagaaagtac agaccgaaattcttcaggga agtaatagga 60 caggaagctc ccgtaaggat actcaaaaac gctataaaaaacgacagagt ggctcacgcc 120 tacctctttg ccggaccgag gggggttggg aagacgactattgcaagaat tctcgcaaaa 180 gctttgaact gtaaaaatcc ctccaaaggt gagccctgcggtgagtgcga aaactgcagg 240 gagatagaca ggggtgtgtt ccctgactta attgaaatggatgccgcctc aaacaggggt 300 atagacgacg taagggcatt aaaagaagcg gtcaattacaaacctataaa aggaaagtac 360 aaggtttaca taatagacga agctcacatg ctcacgaaagaagctttcaa cgctctctta 420 aaaaccctcg aagagccccc tcccagaact gttttcgtcctttgtaccac ggagtacgac 480 aaaattcttc ccacgatact ctcaaggtgt cagaggataatcttctcaaa ggtaagaaag 540 gaaaaagtaa tagagtatct aaaaaagata tgtgaaaaggaagggattga gtgcgaagag 600 ggagcccttg aggttctggc tcatgcctct gaagggtgcatgagggatgc agcctctctc 660 ctggaccagg cgagcgttta cggggaaggc agggtaacaaaagaagtagt ggagaacttc 720 ctcggaattc tcagtcagga aagcgttagg 
agttttctgaaattgcttct gaactcagaa 780 gtggacgaag ctataaagtt cctcagagaa ctctcagaaaagggctacaa cctgaccaag 840 ttttgggaga tgttagaaga ggaagtgaga aacgcaattttagtaaagag cctgaaaaat 900 cccgaaagcg tggttcagaa ctggcaggat tacgaagacttcaaagacta ccctctggaa 960 gccctcctct acgttgagaa cctgataaac aggggtaaagttgaagcgag aacgagagaa 1020 cccttaagag cctttgaact cgcggtaata aagagccttatagtcaaaga cataattccc 1080 gtatcccagc tcggaagtgt ggtaaaggaa accaaaaaggaagaaaagaa agttgaagta 1140 aaagaagagc caaaagtaaa agaagaaaaa ccaaaggagcaggaagagga caggttccag 1200 aaagttttaa acgctgtgga cggcaaaatc cttaaaagaatacttgaagg ggcaaaaagg 1260 gaagaaagag acggaaaaat cgtcctaaag atagaagcctcttatctgag aaccatgaaa 1320 aaggaatttg actcactaaa ggagactttt ccttttttagagtttgaacc cgtggaggat 1380 aaaaaaaaac ctcagaagtc cagcgggacg aggctgttttaaaggtaaag gagctcttca 1440 atgcaaaaat actcaaagta cgaagtaaaa gctaaggtcataaaggtgag aatgcccgtg 1500 gaagagatag ggctgtttaa cgcactaata gacggcttgcccaggtacgc actcacgagg 1560 acgaaggaaa agggaaaggg agaagttttc gttttagcgactccttataa agtcaaggaa 1620 ttgatggaag ctatggaggg tatgaaaaaa cacataaaggatttagaaat cctcggagag 1680 acggatgagg atttaacttt ttaaagtatg ggtgtatctgagcaaaggtt taagctaaaa 1740 acaaacctga aacccgcagg ggaccagccg aaagccataaaaaaactcct tgaaaaccta 1800 aggaaaggcg taaaagaaca aacacttctc ggagtcacgggaagcggaaa gacttttact 1860 ctagcaaacg taatagcgaa gtacaacaaa ccaactcttgtggtagttca caacaaaatt 1920 ctcgcggcac agctatacag ggagtttaaa gaactattccctgaaaacgc tgtagagtac 1980 tttgtctctt actacgacta ttaccaacct gaagcctacattcccgaaaa agatttatac 2040 atagaaaagg acgcgagtat aaacgaaagc tggaacgtttcagacactcc gccacgatat 2100 ccgttctaga aaggagggac gttatagtag ttgcttcagtttcttgcata tacggactcg 2160 ggaaacctga gcactacgaa aacctgagga taaaactccaaaggggaata agactgaact 2220 tgagtaagct cctgaggaaa ctcgttgagc taggatatcagagaaatgac tttgccataa 2280 agagggctac cttctcggtt aggggagacg tggttgagatagtcccttct cacacggaag 2340 attacctcgt gagggtagag ttctgggacg acgaagttgaaagaatagtc ctcatggacg 2400 ctctgaac 2408 120 473 PRT Aquifex aeolicus120 Met Asn Tyr Val Pro Phe Ala 
Arg Lys Tyr Arg Pro Lys Phe Phe Arg 1 510 15 Glu Val Ile Gly Gln Glu Ala Pro Val Arg Ile Leu Lys Asn Ala Ile 2025 30 Lys Asn Asp Arg Val Ala His Ala Tyr Leu Phe Ala Gly Pro Arg Gly 3540 45 Val Gly Lys Thr Thr Ile Ala Arg Ile Leu Ala Lys Ala Leu Asn Cys 5055 60 Lys Asn Pro Ser Lys Gly Glu Pro Cys Gly Glu Cys Glu Asn Cys Arg 6570 75 80 Glu Ile Asp Arg Gly Val Phe Pro Asp Leu Ile Glu Met Asp Ala Ala85 90 95 Ser Asn Arg Gly Ile Asp Asp Val Arg Ala Leu Lys Glu Ala Val Asn100 105 110 Tyr Lys Pro Ile Lys Gly Lys Tyr Lys Val Tyr Ile Ile Asp GluAla 115 120 125 His Met Leu Thr Lys Glu Ala Phe Asn Ala Leu Leu Lys ThrLeu Glu 130 135 140 Glu Pro Pro Pro Arg Thr Val Phe Val Leu Cys Thr ThrGlu Tyr Asp 145 150 155 160 Lys Ile Leu Pro Thr Ile Leu Ser Arg Cys GlnArg Ile Ile Phe Ser 165 170 175 Lys Val Arg Lys Glu Lys Val Ile Glu TyrLeu Lys Lys Ile Cys Glu 180 185 190 Lys Glu Gly Ile Glu Cys Glu Glu GlyAla Leu Glu Val Leu Ala His 195 200 205 Ala Ser Glu Gly Cys Met Arg AspAla Ala Ser Leu Leu Asp Gln Ala 210 215 220 Ser Val Tyr Gly Glu Gly ArgVal Thr Lys Glu Val Val Glu Asn Phe 225 230 235 240 Leu Gly Ile Leu SerGln Glu Ser Val Arg Ser Phe Leu Lys Leu Leu 245 250 255 Leu Asn Ser GluVal Asp Glu Ala Ile Lys Phe Leu Arg Glu Leu Ser 260 265 270 Glu Lys GlyTyr Asn Leu Thr Lys Phe Trp Glu Met Leu Glu Glu Glu 275 280 285 Val ArgAsn Ala Ile Leu Val Lys Ser Leu Lys Asn Pro Glu Ser Val 290 295 300 ValGln Asn Trp Gln Asp Tyr Glu Asp Phe Lys Asp Tyr Pro Leu Glu 305 310 315320 Ala Leu Leu Tyr Val Glu Asn Leu Ile Asn Arg Gly Lys Val Glu Ala 325330 335 Arg Thr Arg Glu Pro Leu Arg Ala Phe Glu Leu Ala Val Ile Lys Ser340 345 350 Leu Ile Val Lys Asp Ile Ile Pro Val Ser Gln Leu Gly Ser ValVal 355 360 365 Lys Glu Thr Lys Lys Glu Glu Lys Lys Val Glu Val Lys GluGlu Pro 370 375 380 Lys Val Lys Glu Glu Lys Pro Lys Glu Gln Glu Glu AspArg Phe Gln 385 390 395 400 Lys Val Leu Asn Ala Val Asp Gly Lys Ile LeuLys Arg Ile Leu Glu 405 410 415 Gly Ala Lys Arg Glu Glu Arg Asp Gly LysIle Val Leu Lys Ile Glu 420 425 430 
Ala Ser Tyr Leu Arg Thr Met Lys LysGlu Phe Asp Ser Leu Lys Glu 435 440 445 Thr Phe Pro Phe Leu Glu Phe GluPro Val Glu Asp Lys Lys Lys Pro 450 455 460 Gln Lys Ser Ser Gly Thr ArgLeu Phe 465 470 121 1090 DNA Aquifex aeolicus 121 atgcgcgtta aggtggacagggaggagctt gaagaggttc ttaaaaaagc aagagaaagc 60 acggaaaaaa aagccgcactcccgatactc gcgaacttct tactctccgc aaaagaggaa 120 aacttaatcg taagggcaacggacttggaa aactaccttg tagtctccgt aaagggggag 180 gttgaagagg aaggagaggtttgcgtccac tctcaaaaac tctacgatat agtcaagaac 240 ttaaattccg cttacgtttaccttcatacg gaaggtgaaa aactcgtcat aacgggagga 300 aagagtacgt acaaacttccgacagctccc gcggaggact ttcccgaatt tccagaaatc 360 gtagaaggag gagaaacactttcgggaaac cttctcgtta acggaataga aaaggtagag 420 tacgccatag cgaaggaagaagcgaacata gcccttcagg gaatgtatct gagaggatac 480 gaggacagaa ttcactttgtgttcggacgg tcacaggctt gcactttatg aacctctacg 540 taaacattga aaagagtgaagacgagtctt ttgcttactt ctccactccc gagtggaaac 600 tcgccgttag ctcctggaaggagaattccc ggactacatg agtgtcatcc ctgaggagtt 660 ttcggcggaa gtcttgtttgagacagagga agtcttaaag gttttaaaga ggttgaaggc 720 tttaagcgaa ggaaaagtttttcccgtgaa gattacctta agcgaaaacc ttgccatctt 780 tgagttcgcg gatccggagttcggagaagc gagagaggaa attgaagtgg agtacacggg 840 agagcccttt gagataggattcaacggaaa taccttatgg aggcgcttga cgcctacgac 900 agcgaaagag tgtggttcaagttcacaacc cccgacacgg ccactttatt ggaggctgaa 960 gattacgaaa aggaaccttacaagtgcata ataatgccga tgagggtgta gccatgaaaa 1020 aagctttaat ctttttattgagcttgagcc ttttaattcc tgcgtttagc gaagccaaac 1080 ccaagtcttc 1090 122 363PRT Aquifex aeolicus 122 Met Arg Val Lys Val Asp Arg Glu Glu Leu Glu GluVal Leu Lys Lys 1 5 10 15 Ala Arg Glu Ser Thr Glu Lys Lys Ala Ala LeuPro Ile Leu Ala Asn 20 25 30 Phe Leu Leu Ser Ala Lys Glu Glu Asn Leu IleVal Arg Ala Thr Asp 35 40 45 Leu Glu Asn Tyr Leu Val Val Ser Val Lys GlyGlu Val Glu Glu Glu 50 55 60 Gly Glu Val Cys Val His Ser Gln Lys Leu TyrAsp Ile Val Lys Asn 65 70 75 80 Leu Asn Ser Ala Tyr Val Tyr Leu His ThrGlu Gly Glu Lys Leu Val 85 90 95 Ile Thr Gly Gly Lys Ser Thr Tyr Lys LeuPro 
Thr Ala Pro Ala Glu 100 105 110 Asp Phe Pro Glu Phe Pro Glu Ile ValGlu Gly Gly Glu Thr Leu Ser 115 120 125 Gly Asn Leu Leu Val Asn Gly IleGlu Lys Val Glu Tyr Ala Ile Ala 130 135 140 Lys Glu Glu Ala Asn Ile AlaLeu Gln Gly Met Tyr Leu Arg Gly Tyr 145 150 155 160 Glu Asp Arg Ile HisPhe Val Gly Ser Asp Gly His Arg Leu Ala Leu 165 170 175 Tyr Glu Pro LeuGly Glu Phe Ser Lys Glu Leu Leu Ile Pro Arg Lys 180 185 190 Ser Leu LysVal Leu Lys Lys Leu Ile Thr Gly Ile Glu Asp Val Asn 195 200 205 Ile GluLys Ser Glu Asp Glu Ser Phe Ala Tyr Phe Ser Thr Pro Glu 210 215 220 TrpLys Leu Ala Val Arg Leu Leu Glu Gly Glu Phe Pro Asp Tyr Met 225 230 235240 Ser Val Ile Pro Glu Glu Phe Ser Ala Glu Val Leu Phe Glu Thr Glu 245250 255 Glu Val Leu Lys Val Leu Lys Arg Leu Lys Ala Leu Ser Glu Gly Lys260 265 270 Val Phe Pro Val Lys Ile Thr Leu Ser Glu Asn Leu Ala Ile PheGlu 275 280 285 Phe Ala Asp Pro Glu Phe Gly Glu Ala Arg Glu Glu Ile GluVal Glu 290 295 300 Tyr Thr Gly Glu Pro Phe Glu Ile Gly Phe Asn Gly LysTyr Leu Met 305 310 315 320 Glu Ala Leu Asp Ala Tyr Asp Ser Glu Arg ValTrp Phe Lys Phe Thr 325 330 335 Thr Pro Asp Thr Ala Thr Leu Leu Glu AlaGlu Asp Tyr Glu Lys Glu 340 345 350 Pro Tyr Lys Cys Ile Ile Met Pro MetArg Val 355 360 123 1093 DNA Aquifex aeolicus 123 gtggaaacca caatattccagttccagaaa acttttttca caaaacctcc gaaggagagg 60 gtcttcgtcc ttcatggagaagagcagtat ctcataagaa cctttttgtc taagctgaag 120 gaaaagtacg gggagaattacacggttctg tggggggatg agataagcga ggaggaattc 180 tacactgccc tttccgagaccagtatattc ggcggttcaa aggaaaaagc ggtggtcatt 240 tacaacttcg gggatttcctgaagaagctc ggaaggaaga aaaaggaaaa agaaaggctt 300 ataaaagtcc tcagaaacgtaaagagtaac tacgtattta tagtgtacga tgcgaaactc 360 cagaaacagg aactttcttcggaacctctg aaatccgtag cgtctttcgg cggtatagtg 420 gtagcaaaca ggctgagcaaggagaggata aaacagctcg tccttaagaa gttcaaagaa 480 aaagggataa acgtagaaaacgatgccctt gaataccttc tccagctcac gggttacaac 540 ttgatggagc tcaaacttgaggttgaaaaa ctgatagatt acgcaagtga aaagaaaatt 600 ttaacactcg atgaggtaaagagagtagcc ttctcagtct cagaaaacgt 
aaacgtattt 660 gagttcgttg atttactcctcttaaaagat tacgaaaagg ctcttaaagt tttggactcc 720 ctcatttcct tcggaatacaccccctccag attatgaaaa tcctgtcctc ctatgctcta 780 aaactttaca ccctcaagaggcttgaagag aagggagagg acctgaataa ggcgatggaa 840 agcgtgggaa taaagaacaactttctcaag atgaagttca aatcttactt aaaggcaaac 900 tctaaagagg acttgaagaacctaatcctc tccctccaga ggatagacgc tttttctaaa 960 ctttactttc aggacacagtgcagttgctg gggatttctt gacctcaaga ctggagaggg 1020 aagttgtgaa aaatacttctcatggtggat aatctttttt atgaagtttg cggtttgcgt 1080 ttttcccggt tct 1093 124350 PRT Aquifex aeolicus 124 Val Glu Thr Thr Ile Phe Gln Phe Gln Lys ThrPhe Phe Thr Lys Pro 1 5 10 15 Pro Lys Glu Arg Val Phe Val Leu His GlyGlu Glu Gln Tyr Leu Ile 20 25 30 Arg Thr Phe Leu Ser Lys Leu Lys Glu LysTyr Gly Glu Asn Tyr Thr 35 40 45 Val Leu Trp Gly Asp Glu Ile Ser Glu GluGlu Phe Tyr Thr Ala Leu 50 55 60 Ser Glu Thr Ser Ile Phe Gly Gly Ser LysGlu Lys Ala Val Val Ile 65 70 75 80 Tyr Asn Phe Gly Asp Phe Leu Lys LysLeu Gly Arg Lys Lys Lys Glu 85 90 95 Lys Glu Arg Leu Ile Lys Val Leu ArgAsn Val Lys Ser Asn Tyr Val 100 105 110 Phe Ile Val Tyr Asp Ala Lys LeuGln Lys Gln Glu Leu Ser Ser Glu 115 120 125 Pro Leu Lys Ser Val Ala SerPhe Gly Gly Ile Val Val Ala Asn Arg 130 135 140 Leu Ser Lys Glu Arg IleLys Gln Leu Val Leu Lys Lys Phe Lys Glu 145 150 155 160 Lys Gly Ile AsnVal Glu Asn Asp Ala Leu Glu Tyr Leu Leu Gln Leu 165 170 175 Thr Gly TyrAsn Leu Met Glu Leu Lys Leu Glu Val Glu Lys Leu Ile 180 185 190 Asp TyrAla Ser Glu Lys Lys Ile Leu Thr Leu Asp Glu Val Lys Arg 195 200 205 ValAla Phe Ser Val Ser Glu Asn Val Asn Val Phe Glu Phe Val Asp 210 215 220Leu Leu Leu Leu Lys Asp Tyr Glu Lys Ala Leu Lys Val Leu Asp Ser 225 230235 240 Leu Ile Ser Phe Gly Ile His Pro Leu Gln Ile Met Lys Ile Leu Ser245 250 255 Ser Tyr Ala Leu Lys Leu Tyr Thr Leu Lys Arg Leu Glu Glu LysGly 260 265 270 Glu Asp Leu Asn Lys Ala Met Glu Ser Val Gly Ile Lys AsnAsn Phe 275 280 285 Leu Lys Met Lys Phe Lys Ser Tyr Leu Lys Ala Asn SerLys Glu Asp 290 295 300 Leu Lys Asn Leu Ile Leu Ser 
Leu Gln Arg Ile AspAla Phe Ser Lys 305 310 315 320 Leu Tyr Phe Gln Asp Thr Val Gln Leu LeuArg Asp Phe Leu Thr Ser 325 330 335 Arg Leu Glu Arg Glu Val Val Lys AsnThr Ser His Gly Gly 340 345 350 125 1051 DNA Aquifex aeolicus 125atggaaaaag tttttttgga aaaactccag aaaaccttgc acatacccgg aggactcctt 60ttttacggca aagaaggaag cggaaagacg aaaacagctt ttgaatttgc aaaaggtatt 120ttatgtaagg aaaacgtacc tggggatgcg gaagttgtcc ctcctgcaaa cacgtaaacg 180agctggagga agccttcttt aaaggagaaa tagaagactt taaagtttat aagacaagga 240cggtaaaaag cacttcgttt accttatggg cgaacatccc gactttgtgg taataatccc 300gagcggacat tacataaaga tagaacagat aagggaagtt aagaactttg cctatgtgaa 360gcccgcacta agcaggagaa aagtaattat aatagacgac gcccacgcga tgacctctca 420ggcggcaaac gctcttttaa aggtattgga agagccacct gcggacacca cctttatctt 480gaccacgaac aggcgttctg caatcctgcc gactatcctc tccagaactt ttcaagtgga 540gttcaagggc ttttcagtaa aagaggttat ggaaatagcg aaagtagacg aggaaatagc 600gaaactctct ggaggcagtc taaaaagggc tatcttacta aaggaaaaca aagatatcct 660aaacaaagta aaggaattct tggaaaacga gccgttaaaa gtttacaagc ttgcaagtga 720attcgaaaag tgggaacctg aaaagcaaaa actcttcctt gaaattatgg aagaattggt 780atctcaaaaa ttgaccgaag agaaaaaaga caattacacc taccttcttg atacgatcag 840actctttaaa gacggactcg caaggggtgt aaacgaacct ctgtggctgt ttacgttagc 900cgttcaggcg gattaataaa ccgttattga ttccgtaaca tttaaacctt aatctaaatt 960atgagagcct ttgaaggagg tctggtatgg aaaatttgaa gattagatat atagatacga 1020ggaagatagg aaccgtgagc ggtgtaaaag t 1051 126 305 PRT Aquifex aeolicus 126Met Glu Lys Val Phe Leu Glu Lys Leu Gln Lys Thr Leu His Ile Pro 1 5 1015 Gly Gly Leu Leu Phe Tyr Gly Lys Glu Gly Ser Gly Lys Thr Lys Thr 20 2530 Ala Phe Glu Phe Ala Lys Gly Ile Leu Cys Lys Glu Asn Val Pro Trp 35 4045 Gly Cys Gly Ser Cys Pro Ser Cys Lys His Val Asn Glu Leu Glu Glu 50 5560 Ala Phe Phe Lys Gly Glu Ile Glu Asp Phe Lys Val Tyr Lys Asp Lys 65 7075 80 Asp Gly Lys Lys His Phe Val Tyr Leu Met Gly Glu His Pro Asp Phe 8590 95 Val Val Ile Ile Pro Ser Gly His Tyr Ile Lys Ile Glu Gln Ile Arg100 105 110 Glu Val Lys Asn 
Phe Ala Tyr Val Lys Pro Ala Leu Ser Arg ArgLys 115 120 125 Val Ile Ile Ile Asp Asp Ala His Ala Met Thr Ser Gln AlaAla Asn 130 135 140 Ala Leu Leu Lys Val Leu Glu Glu Pro Pro Ala Asp ThrThr Phe Ile 145 150 155 160 Leu Thr Thr Asn Arg Arg Ser Ala Ile Leu ProThr Ile Leu Ser Arg 165 170 175 Thr Phe Gln Val Glu Phe Lys Gly Phe SerVal Lys Glu Val Met Glu 180 185 190 Ile Ala Lys Val Asp Glu Glu Ile AlaLys Leu Ser Gly Gly Ser Leu 195 200 205 Lys Arg Ala Ile Leu Leu Lys GluAsn Lys Asp Ile Leu Asn Lys Val 210 215 220 Lys Glu Phe Leu Glu Asn GluPro Leu Lys Val Tyr Lys Leu Ala Ser 225 230 235 240 Glu Phe Glu Lys TrpGlu Pro Glu Lys Gln Lys Leu Phe Leu Glu Ile 245 250 255 Met Glu Glu LeuVal Ser Gln Lys Leu Thr Glu Glu Lys Lys Asp Asn 260 265 270 Tyr Thr TyrLeu Leu Asp Thr Ile Arg Leu Phe Lys Asp Gly Leu Ala 275 280 285 Arg GlyVal Asn Glu Pro Leu Trp Leu Phe Thr Leu Ala Val Gln Ala 290 295 300 Asp305 127 630 DNA Aquifex aeolicus 127 atgaacttcc tgaaaaagtt ccttttactgagaaaagctc aaaagtctcc ttacttcgaa 60 gagttctacg aagaaatcga tttgaaccagaaggtgaaag atgcaaggtt tgtagttttt 120 gactgcgaag ccacagaact cgacgtaaagaaggcaaaac tcctttcaat aggtgcggtt 180 gaggttaaaa acctggaaat agacctctctaaatcttttt acgagatact caaaagtgac 240 gagataaagg cggcggagat acatggaataaccagggaag acgttgaaaa gtacggaaag 300 gaaccaaagg aagtaatata cgactttctgaagtacataa agggaagcgt tctcgttggc 360 tactacgtga agtttgacgt ctcactcgttgagaagtact ccataaagta cttccagtat 420 ccaatcatca actacaagtt agacctgtttagtttcgtga agagagagta ccagagtggc 480 aggagtcttg acgaccttat gaaggaactcggtgtagaaa taagggcaag gcacaacgcc 540 cttgaagatg cctacataac cgctcttcttttcctaaagt acgtttaccc gaacagggag 600 tacagactaa aggatctccc gattttcctt630 128 210 PRT Aquifex aeolicus 128 Met Asn Phe Leu Lys Lys Phe Leu LeuLeu Arg Lys Ala Gln Lys Ser 1 5 10 15 Pro Tyr Phe Glu Glu Phe Tyr GluGlu Ile Asp Leu Asn Gln Lys Val 20 25 30 Lys Asp Ala Arg Phe Val Val PheAsp Cys Glu Ala Thr Glu Leu Asp 35 40 45 Val Lys Lys Ala Lys Leu Leu SerIle Gly Ala Val Glu Val Lys Asn 50 55 60 Leu Glu Ile Asp Leu 
Ser Lys SerPhe Tyr Glu Ile Leu Lys Ser Asp 65 70 75 80 Glu Ile Lys Ala Ala Glu IleHis Gly Ile Thr Arg Glu Asp Val Glu 85 90 95 Lys Tyr Gly Lys Glu Pro LysGlu Val Ile Tyr Asp Phe Leu Lys Tyr 100 105 110 Ile Lys Gly Ser Val LeuVal Gly Tyr Tyr Val Lys Phe Asp Val Ser 115 120 125 Leu Val Glu Lys TyrSer Ile Lys Tyr Phe Gln Tyr Pro Ile Ile Asn 130 135 140 Tyr Lys Leu AspLeu Phe Ser Phe Val Lys Arg Glu Tyr Gln Ser Gly 145 150 155 160 Arg SerLeu Asp Asp Leu Met Lys Glu Leu Gly Val Glu Ile Arg Ala 165 170 175 ArgHis Asn Ala Leu Glu Asp Ala Tyr Ile Thr Ala Leu Leu Phe Leu 180 185 190Lys Tyr Val Tyr Pro Asn Arg Glu Tyr Arg Leu Lys Asp Leu Pro Ile 195 200205 Phe Leu 210 129 526 DNA Aquifex aeolicus 129 atgctcaata aggtttttataataggaaga cttacgggtg accccgttat aacttatcta 60 ccgagcggaa cgcccgtagtagagtttact ctggcttaca acagaaggta taaaaaccag 120 aacggtgaat ttcaggaggaaagtcacttc tttgacgtaa aggcgtacgg aaaaatggct 180 gaagactggg ctacacgcttctcgaaagga tacctcgtac tcgtagaggg aagactctcc 240 caggaaaagt gggagaaagaaggaaagaag ttctcaaagg tcaggataat agcggaaaac 300 gtaagattaa taaacaggccgaaaggtgct gaacttcaag cagaagaaga ggaggaagtt 360 cctcccattg aggaggaaattgaaaaactc ggtaaagagg aagagaagcc ttttaccgat 420 gaagaggacg aaatacctttttaattttga ggaggttaaa gtatggtagt gagagctcct 480 aagaagaaag tttgtatgtactgtgaacaa aagagagagc cagatt 526 130 147 PRT Aquifex aeolicus 130 MetLeu Asn Lys Val Phe Ile Ile Gly Arg Leu Thr Gly Asp Pro Val 1 5 10 15Ile Thr Tyr Leu Pro Ser Gly Thr Pro Val Val Glu Phe Thr Leu Ala 20 25 30Tyr Asn Arg Arg Tyr Lys Asn Gln Asn Gly Glu Phe Gln Glu Glu Ser 35 40 45His Phe Phe Asp Val Lys Ala Tyr Gly Lys Met Ala Glu Asp Trp Ala 50 55 60Thr Arg Phe Ser Lys Gly Tyr Leu Val Leu Val Glu Gly Arg Leu Ser 65 70 7580 Gln Glu Lys Trp Glu Lys Glu Gly Lys Lys Phe Ser Lys Val Arg Ile 85 9095 Ile Ala Glu Asn Val Arg Leu Ile Asn Arg Pro Lys Gly Ala Glu Leu 100105 110 Gln Ala Glu Glu Glu Glu Glu Val Pro Pro Ile Glu Glu Glu Ile Glu115 120 125 Lys Leu Gly Lys Glu Glu Glu Lys Pro Phe Thr Asp Glu Glu AspGlu 130 135 140 
Ile Pro Phe 145 131 1472 DNA Aquifex aeolicus 131atgcaatttg tggataaact tccctgtgac gaatccgccg agagggcggt tcttggcagt 60atgcttgaag accccgaaaa catacctctg gtacttgaat accttaaaga agaagacttc 120tgcatagacg agcacaagct acttttcagg gttcttacaa acctctggtc cgagtacggc 180aataagctcg atttcgtatt aataaaggat caccttgaaa agaaaaactt actccagaaa 240atacctatag actggctcga agaactctac gaggaggcgg tatcccctga cacgcttgag 300gaagtctgca aaatagtaaa acaacgttcc gcacagaggg cgataattca actcggtata 360gaactcattc acaaaggaaa ggaaaacaaa gactttcaca cattaatcga ggaagcccag 420agcaggatat tttccatagc ggaaagtgct acatctacgc agttttacca tgtgaaagac 480gttgcggaag aagttataga actcatttat aaattcaaaa gctctgacag gctagtcacg 540ggactcccaa gcggtttcac ggaactcgat ctaaagacga cgggattcca ccctggagac 600ttaataatac tcgccgcaag acccggtatg gggaaaaccg cctttatgct ctccataatc 660tacaatctcg caaaagacga gggaaaaccc tcagctgtat tttccttgga aatgagcaag 720gaacagctcg ttatgagact cctctctatg atgtcggagg tcccactttt caagataagg 780tctggaagta tatcgaatga agatttaaag aagcttgaag caagcgcaat agaactcgca 840aagtacgaca tatacctcga cgacacaccc gctctcacta caacggattt aaggataagg 900gcaagaaagc tcagaaagga aaaggaagtt gagttcgtgg cggtggacta cttgcaactt 960ctgagaccgc cagtccgaaa gagttcaaga caggaggaag tggcagaggt ttcaagaaac 1020ttaaaagccc ttgcaaagga acttcacatt cccgttatgg cacttgcgca gctctcccgt 1080gaggtggaaa agaggagtga taaaagaccc cagcttgcgg acctcagaga atccggacag 1140atagaacagg acgcagacct aatccttttc ctccacagac ccgagtacta caagaaaaag 1200ccaaatcccg aagagcaggg tatagcggaa gtgataatag ccaagcaaag gcaaggaccc 1260acggacattg tgaagctcgc atttattaag gagtacacta agtttgcaaa cctagaagcc 1320cttcctgaac aacctcctga agaagaggaa ctttccgaaa ttattgaaac acaggaggat 1380gaaggattcg aagatattga cttctgaaaa ttaaggtttt ataattttat cttggctatc 1440cggggtagct caatcggcag agcgggtggc tg 1472 132 438 PRT Aquifex aeolicus132 Met Gln Phe Val Asp Lys Leu Pro Cys Asp Glu Ser Ala Glu Arg Ala 1 510 15 Val Leu Gly Ser Met Leu Glu Asp Pro Glu Asn Ile Pro Leu Val Leu 2025 30 Glu Tyr Leu Lys Glu Glu Asp Phe Cys Ile Asp Glu His Lys Leu Leu 3540 
45 Phe Arg Val Leu Thr Asn Leu Trp Ser Glu Tyr Gly Asn Lys Leu Asp 5055 60 Phe Val Leu Ile Lys Asp His Leu Glu Lys Lys Asn Leu Leu Gln Lys 6570 75 80 Ile Pro Ile Asp Trp Leu Glu Glu Leu Tyr Glu Glu Ala Val Ser Pro85 90 95 Asp Thr Leu Glu Glu Val Cys Lys Ile Val Lys Gln Arg Ser Ala Gln100 105 110 Arg Ala Ile Ile Gln Leu Gly Ile Thr Ser Thr Gln Phe Tyr HisVal 115 120 125 Lys Asp Val Ala Glu Glu Val Ile Glu Leu Ile Tyr Lys PheLys Ser 130 135 140 Ser Asp Arg Leu Val Thr Gly Leu Pro Ser Gly Phe ThrGlu Leu Asp 145 150 155 160 Leu Lys Thr Thr Gly Phe His Pro Gly Asp LeuIle Ile Leu Ala Ala 165 170 175 Arg Pro Gly Met Gly Lys Thr Ala Phe MetLeu Ser Ile Ile Tyr Asn 180 185 190 Leu Ala Lys Asp Glu Gly Lys Pro SerAla Val Phe Ser Leu Glu Met 195 200 205 Ser Lys Glu Gln Leu Val Met ArgLeu Leu Ser Met Met Ser Glu Val 210 215 220 Pro Leu Phe Lys Ile Arg SerGly Ser Ile Ser Asn Glu Asp Leu Lys 225 230 235 240 Lys Leu Glu Ala SerAla Ile Glu Leu Ala Lys Tyr Asp Ile Tyr Leu 245 250 255 Asp Asp Thr ProAla Leu Thr Thr Thr Asp Leu Arg Ile Arg Ala Arg 260 265 270 Lys Leu ArgLys Glu Lys Glu Val Glu Phe Val Ala Val Asp Tyr Leu 275 280 285 Gln LeuLeu Arg Pro Pro Val Arg Lys Ser Ser Arg Gln Glu Glu Val 290 295 300 AlaGlu Val Ser Arg Asn Leu Lys Ala Leu Ala Lys Glu Leu His Ile 305 310 315320 Pro Val Met Ala Leu Ala Gln Leu Ser Arg Glu Val Glu Lys Arg Ser 325330 335 Asp Lys Arg Pro Gln Leu Ala Asp Leu Arg Glu Ser Gly Gln Ile Glu340 345 350 Gln Asp Ala Asp Leu Ile Leu Phe Leu His Arg Pro Glu Tyr TyrLys 355 360 365 Lys Lys Pro Asn Pro Glu Glu Gln Gly Ile Ala Glu Val IleIle Ala 370 375 380 Lys Gln Arg Gln Gly Pro Thr Asp Ile Val Lys Leu AlaPhe Ile Lys 385 390 395 400 Glu Tyr Thr Lys Phe Ala Asn Leu Glu Ala LeuPro Glu Gln Pro Pro 405 410 415 Glu Glu Glu Glu Leu Ser Glu Ile Ile GluThr Gln Glu Asp Glu Gly 420 425 430 Phe Glu Asp Ile Asp Phe 435 133 1526DNA Aquifex aeolicus 133 atgtcctcgg acatagacga acttagacgg gaaatagatatagtagacgt catttccgaa 60 tacttaaact tagagaaggt aggttccaat tacagaacgaactgtccctt 
tcaccctgac 120 gatacaccct ccttttacgt gtctccaagt aaacaaatattcaagtgttt cggttgcggg 180 gtagggggag acgcgataaa gttcgtttcc ctttacgaggacatctccta ttttgaagcc 240 gcccttgaac tcgcaaaacg ctacggaaag aaattagaccttgaaaagat atcaaaagac 300 gaaaaggtat acgtggctct tgacagggtt tgtgatttctacagggaaag ccttctcaaa 360 aacagagagg caagtgagta cgtaaagagt aggggaatagaccctaaagt agcgaggaag 420 tttgatcttg ggtacgcacc ttccagtgaa gcactcgtaaaagtcttaaa agagaacgat 480 cttttagagg cttaccttga aactaaaaac ctcctttctcctacgaaggg tgtttacagg 540 gatctctttc ttcggcgtgt cgtgatcccg ataaaggatccgaggggaag agttataggt 600 ttcggtggaa ggaggatagt agaggacaaa tctcccaagtacataaactc tccagacagc 660 agggtattta aaaaggggga gaacttattc ggtctttacgaggcaaagga gtatataaag 720 gaagaaggat ttgcgatact tgtggaaggg tactttgaccttttgagact tttttccgag 780 ggaataagga acgttgttgc acccctcggt acagccctgacccaaaatca ggcaaacctc 840 ctttccaagt tcacaaaaaa ggtctacatc ctttacgacggagatgatgc gggaagaaag 900 gctatgaaaa gtgccattcc cctactcctc agtgcaggagtggaagttta tcccgtttac 960 ctccccgaag gatacgatcc cgacgagttt ataaaggaattcgggaaaga ggaattaaga 1020 agactgataa acagctcagg ggagctcttt gaaacgctcataaaaaccgc aagggaaaac 1080 ttagaggaga aaacgcgtga gttcaggtat tatctgggctttatttccga tggagtaagg 1140 cgctttgctc tggcttcgga gtttcacacc aagtacaaagttcctatgga aattttatta 1200 atgaaaattg aaaaaaattc tcaagaaaaa gaaattaaactctcctttaa ggaaaaaatc 1260 ttcctgaaag gactgataga attaaaacca aaaatagaccttgaagtcct gaacttaagt 1320 cctgagttaa aggaactcgc agttaacgcc ttaaacggagaggagcattt acttccaaaa 1380 gaagttctcg agtaccaggt ggataacttg gagaaactttttaacaacat ccttagggat 1440 ttacaaaaat ctgggaaaaa gaggaagaaa agagggttgaaaaatgtaaa tacttaatta 1500 actttaataa atttttagag ttagga 1526 134 498 PRTAquifex aeolicus 134 Met Ser Ser Asp Ile Asp Glu Leu Arg Arg Glu Ile AspIle Val Asp 1 5 10 15 Val Ile Ser Glu Tyr Leu Asn Leu Glu Lys Val GlySer Asn Tyr Arg 20 25 30 Thr Asn Cys Pro Phe His Pro Asp Asp Thr Pro SerPhe Tyr Val Ser 35 40 45 Pro Ser Lys Gln Ile Phe Lys Cys Phe Gly Cys GlyVal Gly Gly Asp 50 55 60 Ala Ile Lys Phe Val Ser Leu Tyr 
Glu Asp Ile SerTyr Phe Glu Ala 65 70 75 80 Ala Leu Glu Leu Ala Lys Arg Tyr Gly Lys LysLeu Asp Leu Glu Lys 85 90 95 Ile Ser Lys Asp Glu Lys Val Tyr Val Ala LeuAsp Arg Val Cys Asp 100 105 110 Phe Tyr Arg Glu Ser Leu Leu Lys Asn ArgGlu Ala Ser Glu Tyr Val 115 120 125 Lys Ser Arg Gly Ile Asp Pro Lys ValAla Arg Lys Phe Asp Leu Gly 130 135 140 Tyr Ala Pro Ser Ser Glu Ala LeuVal Lys Val Leu Lys Glu Asn Asp 145 150 155 160 Leu Leu Glu Ala Tyr LeuGlu Thr Lys Asn Leu Leu Ser Pro Thr Lys 165 170 175 Gly Val Tyr Arg AspLeu Phe Leu Arg Arg Val Val Ile Pro Ile Lys 180 185 190 Asp Pro Arg GlyArg Val Ile Gly Phe Gly Gly Arg Arg Ile Val Glu 195 200 205 Asp Lys SerPro Lys Tyr Ile Asn Ser Pro Asp Ser Arg Val Phe Lys 210 215 220 Lys GlyGlu Asn Leu Phe Gly Leu Tyr Glu Ala Lys Glu Tyr Ile Lys 225 230 235 240Glu Glu Gly Phe Ala Ile Leu Val Glu Gly Tyr Phe Asp Leu Leu Arg 245 250255 Leu Phe Ser Glu Gly Ile Arg Asn Val Val Ala Pro Leu Gly Thr Ala 260265 270 Leu Thr Gln Asn Gln Ala Asn Leu Leu Ser Lys Phe Thr Lys Lys Val275 280 285 Tyr Ile Leu Tyr Asp Gly Asp Asp Ala Gly Arg Lys Ala Met LysSer 290 295 300 Ala Ile Pro Leu Leu Leu Ser Ala Gly Val Glu Val Tyr ProVal Tyr 305 310 315 320 Leu Pro Glu Gly Tyr Asp Pro Asp Glu Phe Ile LysGlu Phe Gly Lys 325 330 335 Glu Glu Leu Arg Arg Leu Ile Asn Ser Ser GlyGlu Leu Phe Glu Thr 340 345 350 Leu Ile Lys Thr Ala Arg Glu Asn Leu GluGlu Lys Thr Arg Glu Phe 355 360 365 Arg Tyr Tyr Leu Gly Phe Ile Ser AspGly Val Arg Arg Phe Ala Leu 370 375 380 Ala Ser Glu Phe His Thr Lys TyrLys Val Pro Met Glu Ile Leu Leu 385 390 395 400 Met Lys Ile Glu Lys AsnSer Gln Glu Lys Glu Ile Lys Leu Ser Phe 405 410 415 Lys Glu Lys Ile PheLeu Lys Gly Leu Ile Glu Leu Lys Pro Lys Ile 420 425 430 Asp Leu Glu ValLeu Asn Leu Ser Pro Glu Leu Lys Glu Leu Ala Val 435 440 445 Asn Ala LeuAsn Gly Glu Glu His Leu Leu Pro Lys Glu Val Leu Glu 450 455 460 Tyr GlnVal Asp Asn Leu Glu Lys Leu Phe Asn Asn Ile Leu Arg Asp 465 470 475 480Leu Gln Lys Ser Gly Lys Lys Arg Lys Lys Arg Gly Leu Lys Asn Val 
485 490495 Asn Thr 135 705 DNA Aquifex aeolicus 135 atgcaagata ccgctacctgcagtatttgt caggggacgg gattcgtaaa gaccgaagac 60 aacaaggtaa ggctctgcgaatgcaggttc aagaaaaggg atgtaaacag ggaactaaac 120 atcccaaaga ggtactggaacgccaactta gacacttacc accccaagaa cgtatcccag 180 aacagggcac ttttgacgataagggtcttc gtccacaact tcaatcccga ggaagggaaa 240 gggcttacct ttgtaggatctcctggagtc ggcaaaactc accttgcggt tgcaacatta 300 aaagcgattt atgagaagaagggaatcaga ggatacttct tcgatacgaa ggatctaata 360 ttcaggttaa aacacttaatggacgaggga aaggatacaa agtttttaaa aactgtctta 420 aactcaccgg ttttggttctcgacgacctc ggttctgaga ggctcagtga ctggcagagg 480 gaactcatct cttacataatcacttacagg tataacaacc ttaagagcac gataataacc 540 acgaattact cactccagagggaagaagag agtagcgtga ggataagtgc ggatcttgca 600 agcagactcg gagaaaacgtagtttcaaaa atttacgaga tgaacgagtt gctcgttata 660 aagggttccg acctcaggaagtctaaaaag ctatcaaccc catct 705 136 235 PRT Aquifex aeolicus 136 Met GlnAsp Thr Ala Thr Cys Ser Ile Cys Gln Gly Thr Gly Phe Val 1 5 10 15 LysThr Glu Asp Asn Lys Val Arg Leu Cys Glu Cys Arg Phe Lys Lys 20 25 30 ArgAsp Val Asn Arg Glu Leu Asn Ile Pro Lys Arg Tyr Trp Asn Ala 35 40 45 AsnLeu Asp Thr Tyr His Pro Lys Asn Val Ser Gln Asn Arg Ala Leu 50 55 60 LeuThr Ile Arg Val Phe Val His Asn Phe Asn Pro Glu Glu Gly Lys 65 70 75 80Gly Leu Thr Phe Val Gly Ser Pro Gly Val Gly Lys Thr His Leu Ala 85 90 95Val Ala Thr Leu Lys Ala Ile Tyr Glu Lys Lys Gly Ile Arg Gly Tyr 100 105110 Phe Phe Asp Thr Lys Asp Leu Ile Phe Arg Leu Lys His Leu Met Asp 115120 125 Glu Gly Lys Asp Thr Lys Phe Leu Lys Thr Val Leu Asn Ser Pro Val130 135 140 Leu Val Leu Asp Asp Leu Gly Ser Glu Arg Leu Ser Asp Trp GlnArg 145 150 155 160 Glu Leu Ile Ser Tyr Ile Ile Thr Tyr Arg Tyr Asn AsnLeu Lys Ser 165 170 175 Thr Ile Ile Thr Thr Asn Tyr Ser Leu Gln Arg GluGlu Glu Ser Ser 180 185 190 Val Arg Ile Ser Ala Asp Leu Ala Ser Arg LeuGly Glu Asn Val Val 195 200 205 Ser Lys Ile Tyr Glu Met Asn Glu Leu LeuVal Ile Lys Gly Ser Asp 210 215 220 Leu Arg Lys Ser Lys Lys Leu Ser ThrPro Ser 225 230 235 137 4101 
DNA Thermatoga maritima 137 atgaaaaagattgaaaattt gaagtggaaa aatgtctcgt ttaaaagcct ggaaatagat 60 cccgatgcaggtgtggttct cgtttccgtg gaaaaattct ccgaagagat agaagacctt 120 gtgcgtttactggagaagaa gacgcggttt cgagtcatcg tgaacggtgt tcaaaaaagt 180 aacggggatctaaggggaaa gatactttcc cttctcaacg gtaatgtgcc ttacataaaa 240 gatgttgttttcgaaggaaa caggctgatt ctgaaagtgc ttggagattt cgcgcgggac 300 aggatcgcctccaaactcag aagcacgaaa aaacagctcg atgaactgct gcctcccgga 360 acagagatcatgctggaggt tgtggagcct ccggaagatc ttttgaaaaa ggaagtacca 420 caaccagaaaagagagaaga accaaagggt gaagaattga agatcgagga tgaaaaccac 480 atctttggacagaaacccag aaagatcgtc ttcaccccct caaaaatctt tgagtacaac 540 aaaaagacatcggtgaaggg caagatcttc aaaatagaga agatcgaggg gaaaagaacg 600 gtccttctgatttacctgac agacggagaa gattctctga tctgcaaagt cttcaacgac 660 gttgaaaaggtcgaagggaa agtatcggtg ggagacgtga tcgttgccac aggagacctc 720 cttctcgaaaacggggagcc caccctttac gtgaagggaa tcacaaaact tcccgaagcg 780 aaaaggatggacaaatctcc ggttaagagg gtggagctcc acgcccatac caagttcagc 840 gatcaggacgcaataacaga tgtgaacgaa tatgtgaaac gagccaagga atggggcttt 900 cccgcgatagccctcacgga tcatgggaac gttcaggcca taccttactt ctacgacgcg 960 gcgaaagaagctggaataaa gcccattttc ggtatcgaag cgtatctggt gagtgacgtg 1020 gagcccgtcataaggaatct ctccgacgat tcgacgtttg gagatgccac gttcgtcgtc 1080 ctcgacttcgagacgacggg tctcgacccg caggtggatg agatcatcga gataggagcg 1140 gtgaagatacagggtggcca gatagtggac gagtaccaca ctctcataaa gccttccagg 1200 gagatctcaagaaaaagttc ggagatcacc ggaatcactc aagagatgct ggaaaacaag 1260 agaagcatcgaggaagttct gccggagttc ctcggttttc tggaagattc catcatcgta 1320 gcacacaacgccaacttcga ctacagattt ctgaggctgt ggatcaaaaa agtgatggga 1380 ttggactgggaaagacccta catagatacg ctcgccctcg caaagtccct tctcaaactg 1440 agaagctactctctggattc cgttgtggaa aagctcggat tgggtccctt ccggcaccac 1500 agggccctggatgacgcgag ggtcaccgct caggttttcc tcaggttcgt tgagatgatg 1560 aagaagatcggtatcacgaa gctttcagaa atggagaagt tgaaggatac gatagactac 1620 accgcgttgaaacccttcca ctgcacgatc ctcgttcaga acaaaaaggg attgaaaaac 1680 ctatacaaactggtttctga 
ttcctatata aagtacttct acggtgttcc gaggatcctc 1740 aaaagtgagctcatcgagaa cagagaagga ctgctcgtgg gtagcgcgtg tatctccggt 1800 gagctcggacgtgccgccct cgaaggagcg agtgattcag aactcgaaga gatcgcgaag 1860 ttctacgactacatagaagt catgccgctc gacgttatag ccgaagatga agaagaccta 1920 gacagagaaagactgaaaga agtgtaccga aaactctaca gaatagcgaa aaaattgaac 1980 aagttcgtcgtcatgaccgg tgatgttcat ttcctcgatc ccgaagatgc caggggcaga 2040 gctgcacttctggcacctca gggaaacaga aacttcgaga atcagcccgc actctacctc 2100 agaacgaccgaagaaatgct cgagaaggcg atagagatat tcgaagatga agagatcgcg 2160 agggaagtcgtgatagagaa tcccaacaga atagccgata tgatcgagga agtgcagccg 2220 ctcgagaaaaaacttcaccc gccgatcata gagaacgccg atgaaatagt gagaaacctc 2280 accatgaagcgggcgtacga gatctacggt gatccgcttc ccgaaatcgt ccagaagcgt 2340 gtggaaaaggaactgaacgc catcataaat catggatacg ccgttctcta tctcatcgct 2400 caggagctcgttcagaaatc tatgagcgat ggttacgtgg ttggatccag aggatccgtc 2460 gggtcttcactcgtggccaa tctcctcgga ataacagagg tgaatcccct accaccacat 2520 tacaggtgtccagagtgcaa atactttgaa gttgtcgaag acgacagata cggagcgggt 2580 tacgaccttcccaacaagaa ctgtccaaga tgtggggctc ctctcagaaa agacggccac 2640 ggcataccgtttgaaacgtt catggggttc gagggtgaca aggtccccga catagatctc 2700 aacttctcaggagagtatca ggaacgtgct catcgttttg tggaagaact cttcggtaaa 2760 gaccacgtctatagggcggg aaccataaac accatcgcgg aaagaagtgc ggtgggttac 2820 gtgagaagctacgaagagaa aaccggaaag aagctcagaa aggcggaaat ggaaagactc 2880 gtttccatgatcacgggagt gaagagaacg acgggtcagc acccaggggg gctcatgatc 2940 ataccgaaagacaaagaagt ctacgatttc actcccatac agtatccagc caacgataga 3000 aacgcaggtgtgttcaccac gcacttcgca tacgagacga tccatgatga cctggtgaag 3060 atagatgcgctcggccacga tgatcccact ttcatcaaga tgctcaagga cctcaccgga 3120 atcgatcccatgacgattcc catggatgac cccgatacgc tcgccatatt cagttctgtg 3180 aagcctcttggtgtggatcc cgttgagctg gaaagcgatg tgggaacgta cggaattccg 3240 gagttcggaaccgagtttgt gaggggaatg ctcgttgaaa cgagaccaaa gagtttcgcc 3300 gagcttgtgagaatctcagg actgtcacac ggtacggacg tctggttgaa caacgcacgt 3360 gattggataaacctcggcta cgccaagctc tccgaggtta tctcgtgtag 
ggacgacatc 3420 atgaacttcctcatacacaa aggaatggaa ccgtcacttg ccttcaagat catggaaaac 3480 gtcaggaagggaaagggtat cacagaagag atggagagcg agatgagaag gctgaaggtt 3540 ccagaatggttcatcgaatc ctgtaaaagg atcaaatatc tcttcccgaa agctcacgct 3600 gtggcttacgtgagtatggc cttcagaatt gcttacttca aggttcacta tcctcttcag 3660 ttttacgcggcgtacttcac gataaaaggt gatcagttcg atccggttct cgtactcagg 3720 ggaaaagaagccataaagag gcgcttgaga gaactcaaag cgatgcctgc caaagacgcc 3780 cagaagaaaaacgaagtgag tgttctggag gttgccctgg aaatgatact gagaggtttt 3840 tccttcctaccgcccgacat cttcaaatcc gacgcgaaga aatttctgat agaaggaaac 3900 tcgctgagaattccgttcaa caaacttcca ggactgggtg acagcgttgc cgagtcgata 3960 atcagagccagggaagaaaa gccgttcact tcggtggaag atctcatgaa gaggaccaag 4020 gtcaacaaaaatcacataga gctgatgaaa agcctgggtg ttctcgggga ccttccagag 4080 acggaacagttcacgctttt c 4101 138 1367 PRT Thermatoga maritima 138 Met Lys Lys IleGlu Asn Leu Lys Trp Lys Asn Val Ser Phe Lys Ser 1 5 10 15 Leu Glu IleAsp Pro Asp Ala Gly Val Val Leu Val Ser Val Glu Lys 20 25 30 Phe Ser GluGlu Ile Glu Asp Leu Val Arg Leu Leu Glu Lys Lys Thr 35 40 45 Arg Phe ArgVal Ile Val Asn Gly Val Gln Lys Ser Asn Gly Asp Leu 50 55 60 Arg Gly LysIle Leu Ser Leu Leu Asn Gly Asn Val Pro Tyr Ile Lys 65 70 75 80 Asp ValVal Phe Glu Gly Asn Arg Leu Ile Leu Lys Val Leu Gly Asp 85 90 95 Phe AlaArg Asp Arg Ile Ala Ser Lys Leu Arg Ser Thr Lys Lys Gln 100 105 110 LeuAsp Glu Leu Leu Pro Pro Gly Thr Glu Ile Met Leu Glu Val Val 115 120 125Glu Pro Pro Glu Asp Leu Leu Lys Lys Glu Val Pro Gln Pro Glu Lys 130 135140 Arg Glu Glu Pro Lys Gly Glu Glu Leu Lys Ile Glu Asp Glu Asn His 145150 155 160 Ile Phe Gly Gln Lys Pro Arg Lys Ile Val Phe Thr Pro Ser LysIle 165 170 175 Phe Glu Tyr Asn Lys Lys Thr Ser Val Lys Gly Lys Ile PheLys Ile 180 185 190 Glu Lys Ile Glu Gly Lys Arg Thr Val Leu Leu Ile TyrLeu Thr Asp 195 200 205 Gly Glu Asp Ser Leu Ile Cys Lys Val Phe Asn AspVal Glu Lys Val 210 215 220 Glu Gly Lys Val Ser Val Gly Asp Val Ile ValAla Thr Gly Asp Leu 225 230 235 240 Leu Leu Glu Asn Gly Glu Pro 
Thr LeuTyr Val Lys Gly Ile Thr Lys 245 250 255 Leu Pro Glu Ala Lys Arg Met AspLys Ser Pro Val Lys Arg Val Glu 260 265 270 Leu His Ala His Thr Lys PheSer Asp Gln Asp Ala Ile Thr Asp Val 275 280 285 Asn Glu Tyr Val Lys ArgAla Lys Glu Trp Gly Phe Pro Ala Ile Ala 290 295 300 Leu Thr Asp His GlyAsn Val Gln Ala Ile Pro Tyr Phe Tyr Asp Ala 305 310 315 320 Ala Lys GluAla Gly Ile Lys Pro Ile Phe Gly Ile Glu Ala Tyr Leu 325 330 335 Val SerAsp Val Glu Pro Val Ile Arg Asn Leu Ser Asp Asp Ser Thr 340 345 350 PheGly Asp Ala Thr Phe Val Val Leu Asp Phe Glu Thr Thr Gly Leu 355 360 365Asp Pro Gln Val Asp Glu Ile Ile Glu Ile Gly Ala Val Lys Ile Gln 370 375380 Gly Gly Gln Ile Val Asp Glu Tyr His Thr Leu Ile Lys Pro Ser Arg 385390 395 400 Glu Ile Ser Arg Lys Ser Ser Glu Ile Thr Gly Ile Thr Gln GluMet 405 410 415 Leu Glu Asn Lys Arg Ser Ile Glu Glu Val Leu Pro Glu PheLeu Gly 420 425 430 Phe Leu Glu Asp Ser Ile Ile Val Ala His Asn Ala AsnPhe Asp Tyr 435 440 445 Arg Phe Leu Arg Leu Trp Ile Lys Lys Val Met GlyLeu Asp Trp Glu 450 455 460 Arg Pro Tyr Ile Asp Thr Leu Ala Leu Ala LysSer Leu Leu Lys Leu 465 470 475 480 Arg Ser Tyr Ser Leu Asp Ser Val ValGlu Lys Leu Gly Leu Gly Pro 485 490 495 Phe Arg His His Arg Ala Leu AspAsp Ala Arg Val Thr Ala Gln Val 500 505 510 Phe Leu Arg Phe Val Glu MetMet Lys Lys Ile Gly Ile Thr Lys Leu 515 520 525 Ser Glu Met Glu Lys LeuLys Asp Thr Ile Asp Tyr Thr Ala Leu Lys 530 535 540 Pro Phe His Cys ThrIle Leu Val Gln Asn Lys Lys Gly Leu Lys Asn 545 550 555 560 Leu Tyr LysLeu Val Ser Asp Ser Tyr Ile Lys Tyr Phe Tyr Gly Val 565 570 575 Pro ArgIle Leu Lys Ser Glu Leu Ile Glu Asn Arg Glu Gly Leu Leu 580 585 590 ValGly Ser Ala Cys Ile Ser Gly Glu Leu Gly Arg Ala Ala Leu Glu 595 600 605Gly Ala Ser Asp Ser Glu Leu Glu Glu Ile Ala Lys Phe Tyr Asp Tyr 610 615620 Ile Glu Val Met Pro Leu Asp Val Ile Ala Glu Asp Glu Glu Asp Leu 625630 635 640 Asp Arg Glu Arg Leu Lys Glu Val Tyr Arg Lys Leu Tyr Arg IleAla 645 650 655 Lys Lys Leu Asn Lys Phe Val Val Met Thr Gly Asp Val HisPhe 
Leu 660 665 670 Asp Pro Glu Asp Ala Arg Gly Arg Ala Ala Leu Leu AlaPro Gln Gly 675 680 685 Asn Arg Asn Phe Glu Asn Gln Pro Ala Leu Tyr LeuArg Thr Thr Glu 690 695 700 Glu Met Leu Glu Lys Ala Ile Glu Ile Phe GluAsp Glu Glu Ile Ala 705 710 715 720 Arg Glu Val Val Ile Glu Asn Pro AsnArg Ile Ala Asp Met Ile Glu 725 730 735 Glu Val Gln Pro Leu Glu Lys LysLeu His Pro Pro Ile Ile Glu Asn 740 745 750 Ala Asp Glu Ile Val Arg AsnLeu Thr Met Lys Arg Ala Tyr Glu Ile 755 760 765 Tyr Gly Asp Pro Leu ProGlu Ile Val Gln Lys Arg Val Glu Lys Glu 770 775 780 Leu Asn Ala Ile IleAsn His Gly Tyr Ala Val Leu Tyr Leu Ile Ala 785 790 795 800 Gln Glu LeuVal Gln Lys Ser Met Ser Asp Gly Tyr Val Val Gly Ser 805 810 815 Arg GlySer Val Gly Ser Ser Leu Val Ala Asn Leu Leu Gly Ile Thr 820 825 830 GluVal Asn Pro Leu Pro Pro His Tyr Arg Cys Pro Glu Cys Lys Tyr 835 840 845Phe Glu Val Val Glu Asp Asp Arg Tyr Gly Ala Gly Tyr Asp Leu Pro 850 855860 Asn Lys Asn Cys Pro Arg Cys Gly Ala Pro Leu Arg Lys Asp Gly His 865870 875 880 Gly Ile Pro Phe Glu Thr Phe Met Gly Phe Glu Gly Asp Lys ValPro 885 890 895 Asp Ile Asp Leu Asn Phe Ser Gly Glu Tyr Gln Glu Arg AlaHis Arg 900 905 910 Phe Val Glu Glu Leu Phe Gly Lys Asp His Val Tyr ArgAla Gly Thr 915 920 925 Ile Asn Thr Ile Ala Glu Arg Ser Ala Val Gly TyrVal Arg Ser Tyr 930 935 940 Glu Glu Lys Thr Gly Lys Lys Leu Arg Lys AlaGlu Met Glu Arg Leu 945 950 955 960 Val Ser Met Ile Thr Gly Val Lys ArgThr Thr Gly Gln His Pro Gly 965 970 975 Gly Leu Met Ile Ile Pro Lys AspLys Glu Val Tyr Asp Phe Thr Pro 980 985 990 Ile Gln Tyr Pro Ala Asn AspArg Asn Ala Gly Val Phe Thr Thr His 995 1000 1005 Phe Ala Tyr Glu ThrIle His Asp Asp Leu Val Lys Ile Asp Ala Leu 1010 1015 1020 Gly His AspAsp Pro Thr Phe Ile Lys Met Leu Lys Asp Leu Thr Gly 1025 1030 1035 1040Ile Asp Pro Met Thr Ile Pro Met Asp Asp Pro Asp Thr Leu Ala Ile 10451050 1055 Phe Ser Ser Val Lys Pro Leu Gly Val Asp Pro Val Glu Leu GluSer 1060 1065 1070 Asp Val Gly Thr Tyr Gly Ile Pro Glu Phe Gly Thr GluPhe Val Arg 1075 1080 
1085 Gly Met Leu Val Glu Thr Arg Pro Lys Ser PheAla Glu Leu Val Arg 1090 1095 1100 Ile Ser Gly Leu Ser His Gly Thr AspVal Trp Leu Asn Asn Ala Arg 1105 1110 1115 1120 Asp Trp Ile Asn Leu GlyTyr Ala Lys Leu Ser Glu Val Ile Ser Cys 1125 1130 1135 Arg Asp Asp IleMet Asn Phe Leu Ile His Lys Gly Met Glu Pro Ser 1140 1145 1150 Leu AlaPhe Lys Ile Met Glu Asn Val Arg Lys Gly Lys Gly Ile Thr 1155 1160 1165Glu Glu Met Glu Ser Glu Met Arg Arg Leu Lys Val Pro Glu Trp Phe 11701175 1180 Ile Glu Ser Cys Lys Arg Ile Lys Tyr Leu Phe Pro Lys Ala HisAla 1185 1190 1195 1200 Val Ala Tyr Val Ser Met Ala Phe Arg Ile Ala TyrPhe Lys Val His 1205 1210 1215 Tyr Pro Leu Gln Phe Tyr Ala Ala Tyr PheThr Ile Lys Gly Asp Gln 1220 1225 1230 Phe Asp Pro Val Leu Val Leu ArgGly Lys Glu Ala Ile Lys Arg Arg 1235 1240 1245 Leu Arg Glu Leu Lys AlaMet Pro Ala Lys Asp Ala Gln Lys Lys Asn 1250 1255 1260 Glu Val Ser ValLeu Glu Val Ala Leu Glu Met Ile Leu Arg Gly Phe 1265 1270 1275 1280 SerPhe Leu Pro Pro Asp Ile Phe Lys Ser Asp Ala Lys Lys Phe Leu 1285 12901295 Ile Glu Gly Asn Ser Leu Arg Ile Pro Phe Asn Lys Leu Pro Gly Leu1300 1305 1310 Gly Asp Ser Val Ala Glu Ser Ile Ile Arg Ala Arg Glu GluLys Pro 1315 1320 1325 Phe Thr Ser Val Glu Asp Leu Met Lys Arg Thr LysVal Asn Lys Asn 1330 1335 1340 His Ile Glu Leu Met Lys Ser Leu Gly ValLeu Gly Asp Leu Pro Glu 1345 1350 1355 1360 Thr Glu Gln Phe Thr Leu Phe1365 139 567 DNA Thermatoga maritima 139 gtgctcgcca tgatatggaacgacaccgtt ttttgcgtcg tagacacaga aaccacggga 60 accgatccct ttgccggagaccggatagtt gaaatagccg ctgttcctgt cttcaagggg 120 aagatctaca gaaacaaagcgtttcactct ctcgtgaatc ccagaataag aatccctgcg 180 ctgattcaga aagttcacggtatcagcaac atggacatcg tggaagcgcc agacatggac 240 acagtttacg atcttttcagggattacgtg aagggaacgg tgctcgtgtt tcacaacgcc 300 aacttcgacc tcacttttctggatatgatg gcaaaggaaa cgggaaactt tccaataacg 360 aatccctaca tcgacacactcgatctttca gaagagatct ttggaaggcc tcattctctc 420 aaatggctct ccgaaagacttggaataaaa accacgatac ggcaccgtgc tcttccagat 480 gccctggtga ccgcaagagtttttgtgaag 
cttgttgaat ttcttggtga aaacagggtc 540 aacgaattca tacgtggaaaacggggg 567 140 189 PRT Thermatoga maritima 140 Met Leu Ala Met Ile TrpAsn Asp Thr Val Phe Cys Val Val Asp Thr 1 5 10 15 Glu Thr Thr Gly ThrAsp Pro Phe Ala Gly Asp Arg Ile Val Glu Ile 20 25 30 Ala Ala Val Pro ValPhe Lys Gly Lys Ile Tyr Arg Asn Lys Ala Phe 35 40 45 His Ser Leu Val AsnPro Arg Ile Arg Ile Pro Ala Leu Ile Gln Lys 50 55 60 Val His Gly Ile SerAsn Met Asp Ile Val Glu Ala Pro Asp Met Asp 65 70 75 80 Thr Val Tyr AspLeu Phe Arg Asp Tyr Val Lys Gly Thr Val Leu Val 85 90 95 Phe His Asn AlaAsn Phe Asp Leu Thr Phe Leu Asp Met Met Ala Lys 100 105 110 Glu Thr GlyAsn Phe Pro Ile Thr Asn Pro Tyr Ile Asp Thr Leu Asp 115 120 125 Leu SerGlu Glu Ile Phe Gly Arg Pro His Ser Leu Lys Trp Leu Ser 130 135 140 GluArg Leu Gly Ile Lys Thr Thr Ile Arg His Arg Ala Leu Pro Asp 145 150 155160 Ala Leu Val Thr Ala Arg Val Phe Val Lys Leu Val Glu Phe Leu Gly 165170 175 Glu Asn Arg Val Asn Glu Phe Ile Arg Gly Lys Arg Gly 180 185 1411434 DNA Thermatoga maritima 141 gtggaagttc tttacaggaa gtacaggccaaagacttttt ctgaggttgt caatcaggat 60 catgtgaaga aggcaataat cggtgctattcagaagaaca gcgtggccca cggatacata 120 ttcgccggtc cgaggggaac ggggaagactactcttgcca gaattctcgc aaaatccctg 180 aactgtgaga acagaaaggg agttgaaccctgcaattcct gcagagcctg cagagagata 240 gacgagggaa ccttcatgga cgtgatagagctcgacgcgg cctccaacag aggaatagac 300 gagatcagaa gaatcagaga cgccgttggatacaggccga tggaaggtaa atacaaagtc 360 tacataatag acgaagttca catgctcacgaaagaagcct tcaacgcgct cctcaaaaca 420 ctcgaagaac ctccttccca cgtcgtgttcgtgctggcaa cgacaaacct tgagaaggtt 480 cctcccacga ttatctcgag atgtcaggttttcgagttca gaaacattcc cgacgagctc 540 atcgaaaaga ggctccagga agttgcggaggctgaaggaa tagagataga cagggaagct 600 ctgagcttca tcgcaaaaag agcctctggaggcttgagag acgcgctcac catgctcgag 660 caggtgtgga agttctcgga aggaaagatagatctcgaga cggtacacag ggcgctcggg 720 ttgataccga tacaggttgt tcgcgattacgtgaacgcta tcttttctgg tgatgtgaaa 780 agggtcttca ccgttctcga cgacgtctattacagcggga aggactacga ggtgctcatt 840 caggaagcag 
tcgaggatct ggtcgaagacctggaaaggg agagaggggt ttaccaggtt 900 tcagcgaacg atatagttca ggtttcgagacaacttctga atcttctgag agagataaag 960 ttcgccgaag aaaaacgact cgtctgtaaagtgggttcgg cttacatagc gacgaggttc 1020 tccaccacaa acgttcagga aaacgatgtcagagaaaaaa acgataattc aaatgtacag 1080 cagaaagaag agaagaaaga aacggtgaaggcaaaagaag aaaaacagga agacagcgag 1140 ttcgagaaac gcttcaaaga actcatggaagaactgaaag aaaagggcga tctctctatc 1200 tttgtcgctc tcagcctctc agaggtgcagtttgacggag aaaaggtgat tatttctttt 1260 gattcatcga aagctatgca ttacgagttgatgaagaaaa aactgcctga gctggaaaac 1320 attttttcta gaaaactcgg gaaaaaagtagaagttgaac ttcgactgat gggaaaagaa 1380 gaaacaatcg agaaggtttc tcagaagatcctgagattgt ttgaacagga ggga 1434 142 478 PRT Thermatoga maritima 142 MetGlu Val Leu Tyr Arg Lys Tyr Arg Pro Lys Thr Phe Ser Glu Val 1 5 10 15Val Asn Gln Asp His Val Lys Lys Ala Ile Ile Gly Ala Ile Gln Lys 20 25 30Asn Ser Val Ala His Gly Tyr Ile Phe Ala Gly Pro Arg Gly Thr Gly 35 40 45Lys Thr Thr Leu Ala Arg Ile Leu Ala Lys Ser Leu Asn Cys Glu Asn 50 55 60Arg Lys Gly Val Glu Pro Cys Asn Ser Cys Arg Ala Cys Arg Glu Ile 65 70 7580 Asp Glu Gly Thr Phe Met Asp Val Ile Glu Leu Asp Ala Ala Ser Asn 85 9095 Arg Gly Ile Asp Glu Ile Arg Arg Ile Arg Asp Ala Val Gly Tyr Arg 100105 110 Pro Met Glu Gly Lys Tyr Lys Val Tyr Ile Ile Asp Glu Val His Met115 120 125 Leu Thr Lys Glu Ala Phe Asn Ala Leu Leu Lys Thr Leu Glu GluPro 130 135 140 Pro Ser His Val Val Phe Val Leu Ala Thr Thr Asn Leu GluLys Val 145 150 155 160 Pro Pro Thr Ile Ile Ser Arg Cys Gln Val Phe GluPhe Arg Asn Ile 165 170 175 Pro Asp Glu Leu Ile Glu Lys Arg Leu Gln GluVal Ala Glu Ala Glu 180 185 190 Gly Ile Glu Ile Asp Arg Glu Ala Leu SerPhe Ile Ala Lys Arg Ala 195 200 205 Ser Gly Gly Leu Arg Asp Ala Leu ThrMet Leu Glu Gln Val Trp Lys 210 215 220 Phe Ser Glu Gly Lys Ile Asp LeuGlu Thr Val His Arg Ala Leu Gly 225 230 235 240 Leu Ile Pro Ile Gln ValVal Arg Asp Tyr Val Asn Ala Ile Phe Ser 245 250 255 Gly Asp Val Lys ArgVal Phe Thr Val Leu Asp Asp Val Tyr Tyr Ser 260 265 270 Gly Lys Asp 
TyrGlu Val Leu Ile Gln Glu Ala Val Glu Asp Leu Val 275 280 285 Glu Asp LeuGlu Arg Glu Arg Gly Val Tyr Gln Val Ser Ala Asn Asp 290 295 300 Ile ValGln Val Ser Arg Gln Leu Leu Asn Leu Leu Arg Glu Ile Lys 305 310 315 320Phe Ala Glu Glu Lys Arg Leu Val Cys Lys Val Gly Ser Ala Tyr Ile 325 330335 Ala Thr Arg Phe Ser Thr Thr Asn Val Gln Glu Asn Asp Val Arg Glu 340345 350 Lys Asn Asp Asn Ser Asn Val Gln Gln Lys Glu Glu Lys Lys Glu Thr355 360 365 Val Lys Ala Lys Glu Glu Lys Gln Glu Asp Ser Glu Phe Glu LysArg 370 375 380 Phe Lys Glu Leu Met Glu Glu Leu Lys Glu Lys Gly Asp LeuSer Ile 385 390 395 400 Phe Val Ala Leu Ser Leu Ser Glu Val Gln Phe AspGly Glu Lys Val 405 410 415 Ile Ile Ser Phe Asp Ser Ser Lys Ala Met HisTyr Glu Leu Met Lys 420 425 430 Lys Lys Leu Pro Glu Leu Glu Asn Ile PheSer Arg Lys Leu Gly Lys 435 440 445 Lys Val Glu Val Glu Leu Arg Leu MetGly Lys Glu Glu Thr Ile Glu 450 455 460 Lys Val Ser Gln Lys Ile Leu ArgLeu Phe Glu Gln Glu Gly 465 470 475 143 1098 DNA Thermatoga maritima 143atgaaagtaa ccgtcacgac tcttgaattg aaagacaaaa taaccatcgc ctcaaaagcg 60ctcgcaaaga aatccgtgaa acccattctt gctggatttc ttttcgaagt gaaagatgga 120aatttctaca tctgcgcgac cgatctcgag accggagtca aagcaaccgt gaatgccgct 180gaaatctccg gtgaggcacg ttttgtggta ccaggagatg tcattcagaa gatggtcaag 240gttctcccag atgagataac ggaactttct ttagaggggg atgctcttgt tataagttct 300ggaagcaccg ttttcaggat caccaccatg cccgcggacg aatttccaga gataacgcct 360gccgagtctg gaataacctt cgaagttgac acttcgctcc tcgaggaaat ggttgaaaag 420gtcatcttcg ccgctgccaa agacgagttc atgcgaaatc tgaatggagt tttctgggaa 480ctccacaaga atcttctcag gctggttgca agtgatggtt tcagacttgc acttgctgaa 540gagcagatag aaaacgagga agaggcgagt ttcttgctct ctttgaagag catgaaagaa 600gttcaaaacg tgctggacaa cacaacggag ccgactataa cggtgaggta cgatggaaga 660agggtttctc tgtcgacaaa tgatgtagaa acggtgatga gagtggtcga cgctgaattt 720cccgattaca aaagggtgat ccccgaaact ttcaaaacga aagtggtggt ttccagaaaa 780gaactcaggg aatctttgaa gagggtgatg gtgattgcca gcaagggaag cgagtccgtg 840aagttcgaaa tagaagaaaa cgttatgaga 
cttgtgagca agagcccgga ttatggagaa 900gtggtcgatg aagttgaagt tcaaaaagaa ggggaagatc tcgtgatcgc tttcaacccg 960aagttcatcg aggacgtttt gaagcacatt gagactgaag aaatcgaaat gaacttcgtt 1020gattctacca gtccatgtca gataaatcca ctcgatattt ctggatacct ttacatagtg 1080atgcccatca gactggca 1098 144 366 PRT Thermatoga maritima 144 Met Lys ValThr Val Thr Thr Leu Glu Leu Lys Asp Lys Ile Thr Ile 1 5 10 15 Ala SerLys Ala Leu Ala Lys Lys Ser Val Lys Pro Ile Leu Ala Gly 20 25 30 Phe LeuPhe Glu Val Lys Asp Gly Asn Phe Tyr Ile Cys Ala Thr Asp 35 40 45 Leu GluThr Gly Val Lys Ala Thr Val Asn Ala Ala Glu Ile Ser Gly 50 55 60 Glu AlaArg Phe Val Val Pro Gly Asp Val Ile Gln Lys Met Val Lys 65 70 75 80 ValLeu Pro Asp Glu Ile Thr Glu Leu Ser Leu Glu Gly Asp Ala Leu 85 90 95 ValIle Ser Ser Gly Ser Thr Val Phe Arg Ile Thr Thr Met Pro Ala 100 105 110Asp Glu Phe Pro Glu Ile Thr Pro Ala Glu Ser Gly Ile Thr Phe Glu 115 120125 Val Asp Thr Ser Leu Leu Glu Glu Met Val Glu Lys Val Ile Phe Ala 130135 140 Ala Ala Lys Asp Glu Phe Met Arg Asn Leu Asn Gly Val Phe Trp Glu145 150 155 160 Leu His Lys Asn Leu Leu Arg Leu Val Ala Ser Asp Gly PheArg Leu 165 170 175 Ala Leu Ala Glu Glu Gln Ile Glu Asn Glu Glu Glu AlaSer Phe Leu 180 185 190 Leu Ser Leu Lys Ser Met Lys Glu Val Gln Asn ValLeu Asp Asn Thr 195 200 205 Thr Glu Pro Thr Ile Thr Val Arg Tyr Asp GlyArg Arg Val Ser Leu 210 215 220 Ser Thr Asn Asp Val Glu Thr Val Met ArgVal Val Asp Ala Glu Phe 225 230 235 240 Pro Asp Tyr Lys Arg Val Ile ProGlu Thr Phe Lys Thr Lys Val Val 245 250 255 Val Ser Arg Lys Glu Leu ArgGlu Ser Leu Lys Arg Val Met Val Ile 260 265 270 Ala Ser Lys Gly Ser GluSer Val Lys Phe Glu Ile Glu Glu Asn Val 275 280 285 Met Arg Leu Val SerLys Ser Pro Asp Tyr Gly Glu Val Val Asp Glu 290 295 300 Val Glu Val GlnLys Glu Gly Glu Asp Leu Val Ile Ala Phe Asn Pro 305 310 315 320 Lys PheIle Glu Asp Val Leu Lys His Ile Glu Thr Glu Glu Ile Glu 325 330 335 MetAsn Phe Val Asp Ser Thr Ser Pro Cys Gln Ile Asn Pro Leu Asp 340 345 350Ile Ser Gly Tyr Leu Tyr Ile Val Met Pro Ile Arg 
Leu Ala 355 360 365 145972 DNA Thermatoga maritima 145 atgccagtca cgtttctcac aggtactgcagaaactcaga aggaagaatt gataaagaaa 60 ctcctgaagg atggtaacgt ggagtacataaggatccatc cggaggatcc cgacaagatc 120 gatttcataa ggtctttact caggacaaagacgatctttt ccaacaagac gatcattgac 180 atcgtcaatt tcgatgagtg gaaagcacaggagcagaagc gtctcgttga acttttgaaa 240 aacgtaccgg aagacgttca tatcttcatccgttctcaaa aaacaggtgg aaagggagta 300 gcgctggagc ttccgaagcc atgggaaacggacaagtggc ttgagtggat agaaaagcgc 360 ttcagggaga atggtttgct catcgataaagatgcccttc agctgttttt ctccaaggtt 420 ggaacgaacg acctgatcat agaaagggagattgaaaaac tgaaagctta ttccgaggac 480 agaaagataa cggtagaaga cgtggaagaggtcgttttta cctatcagac tccgggatac 540 gatgattttt gctttgctgt ttccgaaggaaaaaggaagc tcgctcactc tcttctgtcg 600 cagctgtgga aaaccacaga gtccgtggtgattgccactg tccttgcgaa tcacttcttg 660 gatctcttca aaatcctcgt tcttgtgacaaagaaaagat actacacctg gcctgatgtg 720 tccagggtgt ccaaagagct gggaattcccgttcctcgtg tggctcgttt cctcggtttc 780 tcctttaaga cctggaaatt caaggtgatgaaccacctcc tctactacga tgtgaagaag 840 gttagaaaga tactgaggga tctctacgatctggacagag ccgtgaaaag cgaagaagat 900 ccaaaaccgt tcttccacga gttcatagaagaggtggcac tggatgtata ttctcttcag 960 agagatgaag aa 972 146 324 PRTThermatoga maritima 146 Met Pro Val Thr Phe Leu Thr Gly Thr Ala Glu ThrGln Lys Glu Glu 1 5 10 15 Leu Ile Lys Lys Leu Leu Lys Asp Gly Asn ValGlu Tyr Ile Arg Ile 20 25 30 His Pro Glu Asp Pro Asp Lys Ile Asp Phe IleArg Ser Leu Leu Arg 35 40 45 Thr Lys Thr Ile Phe Ser Asn Lys Thr Ile IleAsp Ile Val Asn Phe 50 55 60 Asp Glu Trp Lys Ala Gln Glu Gln Lys Arg LeuVal Glu Leu Leu Lys 65 70 75 80 Asn Val Pro Glu Asp Val His Ile Phe IleArg Ser Gln Lys Thr Gly 85 90 95 Gly Lys Gly Val Ala Leu Glu Leu Pro LysPro Trp Glu Thr Asp Lys 100 105 110 Trp Leu Glu Trp Ile Glu Lys Arg PheArg Glu Asn Gly Leu Leu Ile 115 120 125 Asp Lys Asp Ala Leu Gln Leu PhePhe Ser Lys Val Gly Thr Asn Asp 130 135 140 Leu Ile Ile Glu Arg Glu IleGlu Lys Leu Lys Ala Tyr Ser Glu Asp 145 150 155 160 Arg Lys Ile Thr ValGlu Asp Val Glu Glu Val Val 
Phe Thr Tyr Gln 165 170 175 Thr Pro Gly TyrAsp Asp Phe Cys Phe Ala Val Ser Glu Gly Lys Arg 180 185 190 Lys Leu AlaHis Ser Leu Leu Ser Gln Leu Trp Lys Thr Thr Glu Ser 195 200 205 Val ValIle Ala Thr Val Leu Ala Asn His Phe Leu Asp Leu Phe Lys 210 215 220 IleLeu Val Leu Val Thr Lys Lys Arg Tyr Tyr Thr Trp Pro Asp Val 225 230 235240 Ser Arg Val Ser Lys Glu Leu Gly Ile Pro Val Pro Arg Val Ala Arg 245250 255 Phe Leu Gly Phe Ser Phe Lys Thr Trp Lys Phe Lys Val Met Asn His260 265 270 Leu Leu Tyr Tyr Asp Val Lys Lys Val Arg Lys Ile Leu Arg AspLeu 275 280 285 Tyr Asp Leu Asp Arg Ala Val Lys Ser Glu Glu Asp Pro LysPro Phe 290 295 300 Phe His Glu Phe Ile Glu Glu Val Ala Leu Asp Val TyrSer Leu Gln 305 310 315 320 Arg Asp Glu Glu 147 936 DNA Thermatogamaritima 147 atgaacgatt tgatcagaaa gtacgctaaa gatcaactgg aaactttgaaaaggatcata 60 gaaaagtctg aaggaatatc catcctcata aatggagaag atctctcgtatccgagagaa 120 gtatcccttg aacttcccga gtacgtggag aaatttcccc cgaaggcctcggatgttctg 180 gagatagatc ccgaggggga gaacataggc atagacgaca tcagaacgataaaggacttc 240 ctgaactaca gccccgagct ctacacgaga aagtacgtga tagtccacgactgtgaaaga 300 atgacccagc aggcggcgaa cgcgtttctg aaggcccttg aagaaccaccagaatacgct 360 gtgatcgttc tgaacactcg ccgctggcat tatctactgc cgacgataaagagccgagtg 420 ttcagagtgg ttgtgaacgt tccaaaggag ttcagagatc tcgtgaaagagaaaatagga 480 gatctctggg aggaacttcc acttcttgag agagacttca aaacggctctcgaagcctac 540 aaacttggtg cggaaaaact ttctggattg atggaaagtc tcaaagttttggagacggaa 600 aaactcttga aaaaggtcct ttcaaaaggc ctcgaaggtt atctcgcatgtagggagctc 660 ctggagagat tttcaaaggt ggaatcgaag gaattctttg cgctttttgatcaggtgact 720 aacacgataa caggaaaaga cgcgtttctt ttgatccaga gactgacaagaatcattctc 780 cacgaaaaca catgggaaag cgttgaagat caaaaaagcg tgtctttcctcgattcaatt 840 ctcagggtga agatagcgaa tctgaacaac aaactcactc tgatgaacatcctcgcgata 900 cacagagaga gaaagagagg tgtcaacgct tggagc 936 148 311 PRTThermatoga maritima 148 Met Asn Asp Leu Ile Arg Lys Tyr Ala Lys Asp GlnLeu Glu Thr Leu 1 5 10 15 Lys Arg Ile Ile Glu Lys Ser Glu Gly Ile SerIle Leu 
Ile Asn Gly 20 25 30 Glu Asp Leu Ser Tyr Pro Arg Glu Val Ser LeuGlu Leu Pro Glu Tyr 35 40 45 Val Glu Lys Phe Pro Pro Lys Ala Ser Asp ValLeu Glu Ile Asp Pro 50 55 60 Glu Gly Glu Asn Ile Gly Ile Asp Asp Ile ArgThr Ile Lys Asp Phe 65 70 75 80 Leu Asn Tyr Ser Pro Glu Leu Tyr Thr ArgLys Tyr Val Ile Val His 85 90 95 Asp Cys Glu Arg Met Thr Gln Gln Ala AlaAsn Ala Phe Leu Lys Ala 100 105 110 Leu Glu Glu Pro Pro Glu Tyr Ala ValIle Val Leu Asn Thr Arg Arg 115 120 125 Trp His Tyr Leu Leu Pro Thr IleLys Ser Arg Val Phe Arg Val Val 130 135 140 Val Asn Val Pro Lys Glu PheArg Asp Leu Val Lys Glu Lys Ile Gly 145 150 155 160 Asp Leu Trp Glu GluLeu Pro Leu Leu Glu Arg Asp Phe Lys Thr Ala 165 170 175 Leu Glu Ala TyrLys Leu Gly Ala Glu Lys Leu Ser Gly Leu Met Glu 180 185 190 Ser Leu LysVal Leu Glu Thr Glu Lys Leu Leu Lys Lys Val Leu Ser 195 200 205 Lys GlyLeu Glu Gly Tyr Leu Ala Cys Arg Glu Leu Leu Glu Arg Phe 210 215 220 SerLys Val Glu Ser Lys Glu Phe Phe Ala Leu Phe Asp Gln Val Thr 225 230 235240 Asn Thr Ile Thr Gly Lys Asp Ala Phe Leu Leu Ile Gln Arg Leu Thr 245250 255 Arg Ile Ile Leu His Glu Asn Thr Trp Glu Ser Val Glu Asp Lys Ser260 265 270 Val Ser Phe Leu Asp Ser Ile Leu Arg Val Lys Ile Ala Asn LeuAsn 275 280 285 Asn Lys Leu Thr Leu Met Asn Ile Leu Ala Ile His Arg GluArg Lys 290 295 300 Arg Gly Val Asn Ala Trp Ser 305 310 149 423 DNAThermatoga maritima 149 atgtctttct tcaacaagat catactcata ggaagactcgtgagagatcc cgaagagaga 60 tacacgctca gcggaactcc agtcaccacc ttcaccatagcggtggacag ggttcccaga 120 aagaacgcgc cggacgacgc tcaaacgact gatttcttcaggatcgtcac ctttggaaga 180 ctggcagagt tcgctagaac ctatctcacc aaaggaaggctcgttctcgt cgaaggtgaa 240 atgagaatga gaagatggga aacacccact ggagaaaagagggtatctcc ggaggttgtc 300 gcaaacgttg ttagattcat ggacagaaaa cctgctgaaacagttagcga gactgaagag 360 gagctggaaa taccggaaga agacttttcc agcgataccttcagtgaaga tgaaccacca 420 ttt 423 150 141 PRT Thermatoga maritima 150Met Ser Phe Phe Asn Lys Ile Ile Leu Ile Gly Arg Leu Val Arg Asp 1 5 1015 Pro Glu Glu Arg Tyr Thr Leu Ser Gly 
Thr Pro Val Thr Thr Phe Thr 20 2530 Ile Ala Val Asp Arg Val Pro Arg Lys Asn Ala Pro Asp Asp Ala Gln 35 4045 Thr Thr Asp Phe Phe Arg Ile Val Thr Phe Gly Arg Leu Ala Glu Phe 50 5560 Ala Arg Thr Tyr Leu Thr Lys Gly Arg Leu Val Leu Val Glu Gly Glu 65 7075 80 Met Arg Met Arg Arg Trp Glu Thr Pro Thr Gly Glu Lys Arg Val Ser 8590 95 Pro Glu Val Val Ala Asn Val Val Arg Phe Met Asp Arg Lys Pro Ala100 105 110 Glu Thr Val Ser Glu Thr Glu Glu Glu Leu Glu Ile Pro Glu GluAsp 115 120 125 Phe Ser Ser Asp Thr Phe Ser Glu Asp Glu Pro Pro Phe 130135 140 151 1353 DNA Thermatoga maritima 151 atgcgtgttc ccccgcacaacttagaggcc gaagttgctg tgctcggaag catattgata 60 gatccgtcgg taataaacgacgttcttgaa attttgagcc acgaagattt ctatctgaaa 120 aaacaccaac acatcttcagagcgatggaa gagctttacg acgaaggaaa accggtggac 180 gtggtttccg tctgtgacaagcttcaaagc atgggaaaac tcgaggaagt aggtggagat 240 ctggaagtgg cccagctcgctgaggctgtg cccagttctg cacacgcact tcactacgcg 300 gagatcgtca aggaaaaatccattctgagg aaactcattg agatctccag aaaaatctca 360 gaaagtgcct acatggaagaagatgtggag atcctgctcg acaacgcaga aaagatgatc 420 ttcgagatct cagagatgaaaacgacaaaa tcctacgatc atctgagagg catcatgcac 480 cgggtgtttg aaaacctggagaacttcagg gaaagagcca accttataga acccggtgtg 540 ctcataacgg gactaccaacgggattcaaa agtctggaca aacagaccac agggttccac 600 agctccgatc tggtgataatagcagcgaga ccctccatgg gaaaaacctc cttcgcactc 660 tcaatagcga ggaacatggctgtcaatttc gaaatccccg tcggaatatt cagtctcgag 720 atgtccaagg aacagctcgctcaaagacta ctcagcatgg agtccggtgt ggatctttac 780 agcatcagaa caggatacctggatcaggag aagtgggaaa gactcacaat agcggcttct 840 aaactctaca aagcacccatagttgtggac gatgagtcac tcctcgatcc gcgatcgttg 900 agggcaaaag cgagaaggatgaaaaaagaa tacgatgtaa aagccatttt tgtcgactat 960 ctccagctca tgcacctgaaaggaagaaaa gaaagcagac agcaggagat atccgagatc 1020 tcgagatctc tgaagctccttgcgagggaa ctcgacatag tggtgatagc gctttcacag 1080 ctttcgaggg ccgtagaacagagagaagac aaaagaccga ggctgagtga cctcagggaa 1140 tccggtgcga tagaacaggacgcagacaca gtcatcttca tctacaggga ggaatattac 1200 aggagcaaaa aatccaaagaggaaagcaag 
cttcacgaac ctcacgaagc tgaaatcata 1260 ataggtaaac agagaaacggtcccgttgga acgatcactc tgatcttcga ccccagaacg 1320 gttacgttcc atgaagtcgatgtggtgcat tca 1353 152 451 PRT Thermatoga maritima 152 Met Arg Val ProPro His Asn Leu Glu Ala Glu Val Ala Val Leu Gly 1 5 10 15 Ser Ile LeuIle Asp Pro Ser Val Ile Asn Asp Val Leu Glu Ile Leu 20 25 30 Ser His GluAsp Phe Tyr Leu Lys Lys His Gln His Ile Phe Arg Ala 35 40 45 Met Glu GluLeu Tyr Asp Glu Gly Lys Pro Val Asp Val Val Ser Val 50 55 60 Cys Asp LysLeu Gln Ser Met Gly Lys Leu Glu Glu Val Gly Gly Asp 65 70 75 80 Leu GluVal Ala Gln Leu Ala Glu Ala Val Pro Ser Ser Ala His Ala 85 90 95 Leu HisTyr Ala Glu Ile Val Lys Glu Lys Ser Ile Leu Arg Lys Leu 100 105 110 IleGlu Ile Ser Arg Lys Ile Ser Glu Ser Ala Tyr Met Glu Glu Asp 115 120 125Val Glu Ile Leu Leu Asp Asn Ala Glu Lys Met Ile Phe Glu Ile Ser 130 135140 Glu Met Lys Thr Thr Lys Ser Tyr Asp His Leu Arg Gly Ile Met His 145150 155 160 Arg Val Phe Glu Asn Leu Glu Asn Phe Arg Glu Arg Ala Asn LeuIle 165 170 175 Glu Pro Gly Val Leu Ile Thr Gly Leu Pro Thr Gly Phe LysSer Leu 180 185 190 Asp Lys Gln Thr Thr Gly Phe His Ser Ser Asp Leu ValIle Ile Ala 195 200 205 Ala Arg Pro Ser Met Gly Lys Thr Ser Phe Ala LeuSer Ile Ala Arg 210 215 220 Asn Met Ala Val Asn Phe Glu Ile Pro Val GlyIle Phe Ser Leu Glu 225 230 235 240 Met Ser Lys Glu Gln Leu Ala Gln ArgLeu Leu Ser Met Glu Ser Gly 245 250 255 Val Asp Leu Tyr Ser Ile Arg ThrGly Tyr Leu Asp Gln Glu Lys Trp 260 265 270 Glu Arg Leu Thr Ile Ala AlaSer Lys Leu Tyr Lys Ala Pro Ile Val 275 280 285 Val Asp Asp Glu Ser LeuLeu Asp Pro Arg Ser Leu Arg Ala Lys Ala 290 295 300 Arg Arg Met Lys LysGlu Tyr Asp Val Lys Ala Ile Phe Val Asp Tyr 305 310 315 320 Leu Gln LeuMet His Leu Lys Gly Arg Lys Glu Ser Arg Gln Gln Glu 325 330 335 Ile SerGlu Ile Ser Arg Ser Leu Lys Leu Leu Ala Arg Glu Leu Asp 340 345 350 IleVal Val Ile Ala Leu Ser Gln Leu Ser Arg Ala Val Glu Gln Arg 355 360 365Glu Asp Lys Arg Pro Arg Leu Ser Asp Leu Arg Glu Ser Gly Ala Ile 370 375380 Glu Gln Asp Ala 
Asp Thr Val Ile Phe Ile Tyr Arg Glu Glu Tyr Tyr 385390 395 400 Arg Ser Lys Lys Ser Lys Glu Glu Ser Lys Leu His Glu Pro HisGlu 405 410 415 Ala Glu Ile Ile Ile Gly Lys Gln Arg Asn Gly Pro Val GlyThr Ile 420 425 430 Thr Leu Ile Phe Asp Pro Arg Thr Val Thr Phe His GluVal Asp Val 435 440 445 Val His Ser 450 153 1695 DNA Thermatoga maritima153 gtgattcctc gagaggtcat cgaggaaata aaagaaaagg ttgacatcgt agaggtcatt 60tccgagtacg tgaatcttac ccgggtaggt tcctcctaca gggctctctg tccctttcat 120tcagaaacca atccttcttt ctacgttcat ccgggtttga agatatacca ttgtttcggc 180tgcggtgcga gtggagacgt catcaaattt cttcaagaaa tggaagggat cagtttccag 240gaagcgctgg aaagacttgc caaaagagct gggattgatc tttctctcta cagaacagaa 300gggacttctg aatacggaaa atacattcgt ttgtacgaag aaacgtggaa aaggtacgtc 360aaagagctgg agaaatcgaa agaggcaaaa gactatttaa aaagcagagg cttctctgaa 420gaagatatag caaagttcgg ctttgggtac gtccccaaga gatccagcat ctctatagaa 480gttgcagaag gcatgaacat aacactggaa gaacttgtca gatacggtat cgcgctgaaa 540aagggtgatc gattcgttga tagattcgaa ggaagaatcg ttgttccaat aaagaacgac 600agtggtcata ttgtggcttt tggtgggcgt gctctcggca acgaagaacc gaagtatttg 660aactctccag agaccaggta tttttcgaag aagaagaccc tttttctctt cgatgaggcg 720aaaaaagtgg caaaagaggt tggttttttc gtcatcaccg aaggctactt cgacgcgctc 780gcattcagaa aggatggaat accaacggcg gtcgctgttc ttggggcgag tctttcaaga 840gaggcgattc taaaactttc ggcgtattcg aaaaacgtca tactgtgttt cgataatgac 900aaagcaggct tcagagccac tctcaaatcc ctcgaggatc tcctagacta cgaattcaac 960gtgcttgtgg caaccccctc tccttacaaa gacccagatg aactctttca gaaagaagga 1020gaaggttcat tgaaaaagat gctgaaaaac tcgcgttcgt tcgaatattt tctggtgacg 1080gctggtgagg tcttctttga caggaacagc cccgcgggtg tgagatccta cctttctttc 1140ctcaaaggtt gggtccaaaa gatgagaagg aaaggatatt tgaaacacat agaaaatctc 1200gtgaatgagg tttcatcttc tctccagata ccagaaaacc agattttgaa cttttttgaa 1260agcgacaggt ctaacactat gcctgttcat gagaccaagt cgtcaaaggt ttacgatgag 1320gggagaggac tggcttattt gtttttgaac tacgaggatt tgagggaaaa gattctggaa 1380ctggacttag aggtactgga agataaaaac gcgagggagt ttttcaagag agtctcactg 
1440ggagaagatt tgaacaaagt catagaaaac ttcccaaaag agctgaaaga ctggattttt 1500gagacaatag aaagcattcc tcctccaaag gatcccgaga aattcctcgg tgacctctcc 1560gaaaagttga aaatccgacg gatagagaga cgtatcgcag aaatagatga tatgataaag 1620aaagcttcaa acgatgaaga aaggcgtctt cttctctcta tgaaagtgga tctcctcaga 1680aaaataaaga ggagg 1695 154 565 PRT Thermatoga maritima 154 Met Ile ProArg Glu Val Ile Glu Glu Ile Lys Glu Lys Val Asp Ile 1 5 10 15 Val GluVal Ile Ser Glu Tyr Val Asn Leu Thr Arg Val Gly Ser Ser 20 25 30 Tyr ArgAla Leu Cys Pro Phe His Ser Glu Thr Asn Pro Ser Phe Tyr 35 40 45 Val HisPro Gly Leu Lys Ile Tyr His Cys Phe Gly Cys Gly Ala Ser 50 55 60 Gly AspVal Ile Lys Phe Leu Gln Glu Met Glu Gly Ile Ser Phe Gln 65 70 75 80 GluAla Leu Glu Arg Leu Ala Lys Arg Ala Gly Ile Asp Leu Ser Leu 85 90 95 TyrArg Thr Glu Gly Thr Ser Glu Tyr Gly Lys Tyr Ile Arg Leu Tyr 100 105 110Glu Glu Thr Trp Lys Arg Tyr Val Lys Glu Leu Glu Lys Ser Lys Glu 115 120125 Ala Lys Asp Tyr Leu Lys Ser Arg Gly Phe Ser Glu Glu Asp Ile Ala 130135 140 Lys Phe Gly Phe Gly Tyr Val Pro Lys Arg Ser Ser Ile Ser Ile Glu145 150 155 160 Val Ala Glu Gly Met Asn Ile Thr Leu Glu Glu Leu Val ArgTyr Gly 165 170 175 Ile Ala Leu Lys Lys Gly Asp Arg Phe Val Asp Arg PheGlu Gly Arg 180 185 190 Ile Val Val Pro Ile Lys Asn Asp Ser Gly His IleVal Ala Phe Gly 195 200 205 Gly Arg Ala Leu Gly Asn Glu Glu Pro Lys TyrLeu Asn Ser Pro Glu 210 215 220 Thr Arg Tyr Phe Ser Lys Lys Lys Thr LeuPhe Leu Phe Asp Glu Ala 225 230 235 240 Lys Lys Val Ala Lys Glu Val GlyPhe Phe Val Ile Thr Glu Gly Tyr 245 250 255 Phe Asp Ala Leu Ala Phe ArgLys Asp Gly Ile Pro Thr Ala Val Ala 260 265 270 Val Leu Gly Ala Ser LeuSer Arg Glu Ala Ile Leu Lys Leu Ser Ala 275 280 285 Tyr Ser Lys Asn ValIle Leu Cys Phe Asp Asn Asp Lys Ala Gly Phe 290 295 300 Arg Ala Thr LeuLys Ser Leu Glu Asp Leu Leu Asp Tyr Glu Phe Asn 305 310 315 320 Val LeuVal Ala Thr Pro Ser Pro Tyr Lys Asp Pro Asp Glu Leu Phe 325 330 335 GlnLys Glu Gly Glu Gly Ser Leu Lys Lys Met Leu Lys Asn Ser Arg 340 345 350Ser Phe Glu 
Tyr Phe Leu Val Thr Ala Gly Glu Val Phe Phe Asp Arg 355 360365 Asn Ser Pro Ala Gly Val Arg Ser Tyr Leu Ser Phe Leu Lys Gly Trp 370375 380 Val Gln Lys Met Arg Arg Lys Gly Tyr Leu Lys His Ile Glu Asn Leu385 390 395 400 Val Asn Glu Val Ser Ser Ser Leu Gln Ile Pro Glu Asn GlnIle Leu 405 410 415 Asn Phe Phe Glu Ser Asp Arg Ser Asn Thr Met Pro ValHis Glu Thr 420 425 430 Lys Ser Ser Lys Val Tyr Asp Glu Gly Arg Gly LeuAla Tyr Leu Phe 435 440 445 Leu Asn Tyr Glu Asp Leu Arg Glu Lys Ile LeuGlu Leu Asp Leu Glu 450 455 460 Val Leu Glu Asp Lys Asn Ala Arg Glu PhePhe Lys Arg Val Ser Leu 465 470 475 480 Gly Glu Asp Leu Asn Lys Val IleGlu Asn Phe Pro Lys Glu Leu Lys 485 490 495 Asp Trp Ile Phe Glu Thr IleGlu Ser Ile Pro Pro Pro Lys Asp Pro 500 505 510 Glu Lys Phe Leu Gly AspLeu Ser Glu Lys Leu Lys Ile Arg Arg Ile 515 520 525 Glu Arg Arg Ile AlaGlu Ile Asp Asp Met Ile Lys Lys Ala Ser Asn 530 535 540 Asp Glu Glu ArgArg Leu Leu Leu Ser Met Lys Val Asp Leu Leu Arg 545 550 555 560 Lys IleLys Arg Arg 565 155 804 DNA Thermus thermophilus 155 atggctctacacccggctca ccctggggca ataatcgggc acgaggccgt tctcgccctc 60 cttccccgcctcaccgccca gaccctgctc ttctccggcc ccgagggggt ggggcggcgc 120 accgtggcccgctggtacgc ctgggggctc aaccgcggct tccccccgcc ctccctgggg 180 gagcacccggacgtcctcga ggtggggccc aaggcccggg acctccgggg ccgggccgag 240 gtgcggctggaggaggtggc gcccctcttg gagtggtgct ccagccaccc ccgggagcgg 300 gtgaaggtggccatcctgga ctcggcccac ctcctcaccg aggccgccgc caacgccctc 360 ctcaagctcctggaggagcc cccttcctac gcccgcatcg tcctcatcgc cccaagccgc 420 gccaccctcctccccaccct ggcctcccgg gccacggagg tggcattcgc ccccgtgccc 480 gaggaggccctgcgcgccct cacccaggac ccggagctcc tccgctacgc cgccggggcc 540 ccgggccgcctccttagggc cctccaggac ccggaggggt accgggcccg catggccagg 600 gcgcaaagggtcctgaaagc cccgcccctg gagcgcctcg ctttgcttcg ggagcttttg 660 gccgaggaggagggggtcca cgccctccac gccgtcctaa agcgcccgga gcacctcctt 720 gccctggagcgggcgcggga ggccctggag gggtacgtga gccccgagct ggtcctcgcc 780 cggctggccttagacttaga gaca 804 156 268 PRT Thermus thermophilus 156 
Met Ala Leu HisPro Ala His Pro Gly Ala Ile Ile Gly His Glu Ala 1 5 10 15 Val Leu AlaLeu Leu Pro Arg Leu Thr Ala Gln Thr Leu Leu Phe Ser 20 25 30 Gly Pro GluGly Val Gly Arg Arg Thr Val Ala Arg Trp Tyr Ala Trp 35 40 45 Gly Leu AsnArg Gly Phe Pro Pro Pro Ser Leu Gly Glu His Pro Asp 50 55 60 Val Leu GluVal Gly Pro Lys Ala Arg Asp Leu Arg Gly Arg Ala Glu 65 70 75 80 Val ArgLeu Glu Glu Val Ala Pro Leu Leu Glu Trp Cys Ser Ser His 85 90 95 Pro ArgGlu Arg Val Lys Val Ala Ile Leu Asp Ser Ala His Leu Leu 100 105 110 ThrGlu Ala Ala Ala Asn Ala Leu Leu Lys Leu Leu Glu Glu Pro Pro 115 120 125Ser Tyr Ala Arg Ile Val Leu Ile Ala Pro Ser Arg Ala Thr Leu Leu 130 135140 Pro Thr Leu Ala Ser Arg Ala Thr Glu Val Ala Phe Ala Pro Val Pro 145150 155 160 Glu Glu Ala Leu Arg Ala Leu Thr Gln Asp Pro Glu Leu Leu ArgTyr 165 170 175 Ala Ala Gly Ala Pro Gly Arg Leu Leu Arg Ala Leu Gln AspPro Glu 180 185 190 Gly Tyr Arg Ala Arg Met Ala Arg Ala Gln Arg Val LeuLys Ala Pro 195 200 205 Pro Leu Glu Arg Leu Ala Leu Leu Arg Glu Leu LeuAla Glu Glu Glu 210 215 220 Gly Val His Ala Leu His Ala Val Leu Lys ArgPro Glu His Leu Leu 225 230 235 240 Ala Leu Glu Arg Ala Arg Glu Ala LeuGlu Gly Tyr Val Ser Pro Glu 245 250 255 Leu Val Leu Ala Arg Leu Ala LeuAsp Leu Glu Thr 260 265 157 729 DNA Thermus thermophilus 157 atgctggacctgagggaggt gggggaggcg gagtggaagg ccctaaagcc ccttttggaa 60 agcgtgcccgagggcgtccc cgtcctcctc ctggacccta agccaagccc ctcccgggcg 120 gccttctaccggaaccggga aaggcgggac ttccccaccc ccaaggggaa ggacctggtg 180 cggcacctggaaaaccgggc caagcgcctg gggctcaggc tcccgggcgg ggtggcccag 240 tacctggcctccctggaggg ggacctcgag gccctggagc gggagctgga gaagcttgcc 300 ctcctctccccacccctcac cctggagaag gtggagaagg tggtggccct gaggcccccc 360 ctcacgggctttgacctggt gcgctccgtc ctggagaagg accccaagga ggccctcctg 420 cgcctaggcggcctcaagga ggagggggag gagcccctca ggctcctcgg ggccctctcc 480 tggcagttcgccctcctcgc ccgggccttc ttcctcctcc gggaaaaccc caggcccaag 540 gaggaggacctcgcccgcct cgaggcccac ccctacgccg cccgccgcgc cctggaggcg 600 gcgaagcgcctcacggaaga 
ggccctcaag gaggccctgg acgccctcat ggaggcggaa 660 aagagggccaagggggggaa agacccgtgg ctcgccctgg aggcggcggt cctccgcctc 720 gcccgttga 729158 292 PRT Thermus thermophilus 158 Met Val Ile Ala Phe Thr Gly Asp ProPhe Leu Ala Arg Glu Ala Leu 1 5 10 15 Leu Glu Glu Ala Arg Leu Arg GlyLeu Ser Arg Phe Thr Glu Pro Thr 20 25 30 Pro Glu Ala Leu Ala Gln Ala LeuAla Pro Gly Leu Phe Gly Gly Gly 35 40 45 Gly Ala Met Leu Asp Leu Arg GluVal Gly Glu Ala Glu Trp Lys Ala 50 55 60 Leu Lys Pro Leu Leu Glu Ser ValPro Glu Gly Val Pro Val Leu Leu 65 70 75 80 Leu Asp Pro Lys Pro Ser ProSer Arg Ala Ala Phe Tyr Arg Asn Arg 85 90 95 Glu Arg Arg Asp Phe Pro ThrPro Lys Gly Lys Asp Leu Val Arg His 100 105 110 Leu Glu Asn Arg Ala LysArg Leu Gly Leu Arg Leu Pro Gly Gly Val 115 120 125 Ala Gln Tyr Leu AlaSer Leu Glu Gly Asp Leu Glu Ala Leu Glu Arg 130 135 140 Glu Leu Glu LysLeu Ala Leu Leu Ser Pro Pro Leu Thr Leu Glu Lys 145 150 155 160 Val GluLys Val Val Ala Leu Arg Pro Pro Leu Thr Gly Phe Asp Leu 165 170 175 ValArg Ser Val Leu Glu Lys Asp Pro Lys Glu Ala Leu Leu Arg Leu 180 185 190Gly Gly Leu Lys Glu Glu Gly Glu Glu Pro Leu Arg Leu Leu Gly Ala 195 200205 Leu Ser Trp Gln Phe Ala Leu Leu Ala Arg Ala Phe Phe Leu Leu Arg 210215 220 Glu Asn Pro Arg Pro Lys Glu Glu Asp Leu Ala Arg Leu Glu Ala His225 230 235 240 Pro Tyr Ala Ala Arg Arg Ala Leu Glu Ala Ala Lys Arg LeuThr Glu 245 250 255 Glu Ala Leu Lys Glu Ala Leu Asp Ala Leu Met Glu AlaGlu Lys Arg 260 265 270 Ala Lys Gly Gly Lys Asp Pro Trp Leu Ala Leu GluAla Ala Val Leu 275 280 285 Arg Leu Ala Arg 290 159 37 DNA ArtificialSequence Description of Artificial Sequence primer 159 gtgtgtcatatgagtaagga tttcgtccac cttcacc 37 160 34 DNA Artificial SequenceDescription of Artificial Sequence primer 160 gtgtgtggat ccggggactactcggaagta aggg 34 161 36 DNA Artificial Sequence Description ofArtificial Sequence primer 161 gtgtgtcata tggaaaccac aatattccag ttccag36 162 39 DNA Artificial Sequence Description of Artificial Sequenceprimer 162 gtgtgtggat ccttatccac catgagaagt 
atttttcac 39 163 41 DNAArtificial Sequence Description of Artificial Sequence primer 163gtgtgtcata tggaaaaagt tttttttgga aaaaactcca g 41 164 35 DNA ArtificialSequence Description of Artificial Sequence primer 164 gtgtgtggatccttaatccg cctgaacggc taacg 35 165 41 DNA Artificial SequenceDescription of Artificial Sequence primer 165 gtgtgtcata tgaactacgttcccttcgcg agaaagtaca g 41 166 36 DNA Artificial Sequence Description ofArtificial Sequence primer 166 gtgtgtggat ccttaaaaca gcctcgtccc gctgga36 167 33 DNA Artificial Sequence Description of Artificial Sequenceprimer 167 gtgtgtcata tgcgcgttaa ggtggacagg gag 33 168 35 DNA ArtificialSequence Description of Artificial Sequence primer 168 tgtgtctcgagtcatggcta caccctcatc ggcat 35 169 47 DNA Artificial SequenceDescription of Artificial Sequence primer 169 gtgtgtcata tgctcaataaggtttttata ataggaagac ttacggg 47 170 39 DNA Artificial SequenceDescription of Artificial Sequence primer 170 gtgtggatcc ttaaaaaggtatttcgtcct cttcatcgg 39 171 807 DNA Thermus thermophilus 171 atggctcgaggcctgaaccg cgttttcctc atcggcgccc tcgccacccg gccggacatg 60 cgctacaccccggcggggct cgccattttg gacctgaccc tcgccggtca ggacctgctt 120 ctttccgataacggggggga accggaggtg tcctggtacc accgggtgag gctcttaggc 180 cgccaggcggagatgtgggg cgacctcttg gaccaagggc agctcgtctt cgtggagggc 240 cgcctggagtaccgccagtg ggaaagggag ggggagaagc ggagcgagct ccagatccgg 300 gccgacttccggaccccctg gacgaccggg ggaagaagcg ggcggaggac agccggggcc 360 agcccaggctccgcgccgcc ctgaaccagg tcttcctcat gggcaacctg acccgggacc 420 cggaactccgctacaccccc cagggcaccg cggtggcccg gctgggcctg gcggtgaacg 480 agcgccgccagggggcggag gagcgcaccc acttcgtgga ggttcaggcc tggcgcgacc 540 tggcggagtgggccgccgag ctgaggaagg gcgacggcct tttcgtgatc ggcaggttgg 600 tgaacgactcctggaccagc tccagcggcg agcggcgctt ccagacccgt gtggaggccc 660 tcaggctggagcgccccacc cgtggacctg cccaggcctg cccaggccgg cggaacaggt 720 cccgcgaagtccagacgggt ggggtggaca ttgacgaagg cttggaagac tttccgccgg 780 aggaggatttgccgttttga gcacgaa 807 172 266 PRT Thermus thermophilus 172 Met Ala ArgGly 
Leu Asn Arg Val Phe Leu Ile Gly Ala Leu Ala Thr 1 5 10 15 Arg ProAsp Met Arg Tyr Thr Pro Ala Gly Leu Ala Ile Leu Asp Leu 20 25 30 Thr LeuAla Gly Gln Asp Leu Leu Leu Ser Asp Asn Gly Gly Glu Pro 35 40 45 Glu ValSer Trp Tyr His Arg Val Arg Leu Leu Gly Arg Gln Ala Glu 50 55 60 Met TrpGly Asp Leu Leu Asp Gln Gly Gln Leu Val Phe Val Glu Gly 65 70 75 80 ArgLeu Glu Tyr Arg Gln Trp Glu Arg Glu Gly Glu Lys Arg Ser Glu 85 90 95 LeuGln Ile Arg Ala Asp Phe Leu Asp Pro Leu Asp Asp Arg Gly Lys 100 105 110Lys Arg Ala Glu Asp Ser Arg Gly Gln Pro Arg Leu Arg Ala Ala Leu 115 120125 Asn Gln Val Phe Leu Met Gly Asn Leu Thr Arg Asp Pro Glu Leu Arg 130135 140 Tyr Thr Pro Gln Gly Thr Ala Val Ala Arg Leu Gly Leu Ala Val Asn145 150 155 160 Glu Arg Arg Gln Gly Ala Glu Glu Arg Thr His Phe Val GluVal Gln 165 170 175 Ala Trp Arg Asp Leu Ala Glu Trp Ala Ala Glu Leu ArgLys Gly Asp 180 185 190 Gly Leu Phe Val Ile Gly Arg Leu Val Asn Asp SerTrp Thr Ser Ser 195 200 205 Ser Gly Glu Arg Arg Phe Gln Thr Arg Val GluAla Leu Arg Leu Glu 210 215 220 Arg Pro Thr Arg Gly Pro Ala Gln Ala CysPro Gly Arg Arg Asn Arg 225 230 235 240 Ser Arg Glu Val Gln Thr Gly GlyVal Asp Ile Asp Glu Gly Leu Glu 245 250 255 Asp Phe Pro Pro Glu Glu AspLeu Pro Phe 260 265 173 992 DNA Bacillus stearothermophilus 173aattccgaca tttcaattga atcgtttatt ccgcttgaaa aagaaggcaa gttgctcgtt 60gatgtgaaaa gaccggggag catcgtactg caggcgcgct ttttctctga aatcgtgaaa 120aaactgccgc aacaaacggt ggaaatcgaa acggaagaca actttttgac gatcatccgc 180tcggggcact cagaattccg cctcaatggg ctaaacgccg acgaatatcc gcgcctgccg 240caaattgaag aagaaaacgt gtttcaaatc ccggctgatt tattgaaaac cgtgattcgg 300caaacggtgt tcgccgtttc tacatcggaa acgcgcccaa tcttgacagg tgtcaactgg 360aaagttgaac atggcgagct tgtctgcaca gcgaccgaca gtcatcgctt agccatgcgc 420aaagtgaaaa ttgagtcgga aaatgaagta tcatacaacg tcgtcatccc tggaaaaagt 480cttaatgagc tcagcaaaat tttggatgac ggcaaccacc cggtggacat cgtcatgaca 540gccaatcaag tgctatttaa ggccgagcac cttctcttct tttcccggct gcttgacggc 600aactatccgg agacggcccg cttgattcca 
acagaaagca aaacgaccat gatcgtcaat 660gcaaaagagt ttctgcaggc aatcgaccga gcgtccttgc ttgctcgaga aggaaggaac 720aacgttgtga aactgacgac gcttcctgga ggaatgctcg aaatttcttc gatttctccg 780agatcgggaa agtgacggag cagctgcaaa cggagtctct tgaaggggaa gagttgaaca 840tttcgttcag cgcgaaatat atgatggacg cgttgcgggc gcttgatgga acagacattt 900caaatcagct tcactggggc catgcggccg ttcctgttgc gcccgcttca accgattcga 960tgcttcagct cattttgccg gtgagaacat at 992 174 334 PRT Bacillusstearothermophilus 174 Asn Ser Asp Ile Ser Ile Ile Glu Ser Phe Ile ProLeu Glu Lys Glu 1 5 10 15 Gly Lys Leu Leu Val Asp Val Lys Arg Pro GlySer Ile Val Leu Gln 20 25 30 Ala Arg Phe Phe Ser Glu Ile Val Lys Lys LeuPro Gln Gln Thr Val 35 40 45 Glu Ile Glu Thr Glu Asp Asn Phe Leu Thr IleIle Arg Ser Gly His 50 55 60 Ser Glu Phe Arg Leu Asn Gly Leu Asn Ala AspGlu Tyr Pro Arg Leu 65 70 75 80 Pro Gln Ile Glu Glu Glu Asn Val Phe GlnIle Pro Ala Asp Leu Leu 85 90 95 Lys Thr Val Ile Arg Gln Thr Val Phe AlaVal Ser Thr Ser Glu Thr 100 105 110 Arg Pro Ile Leu Thr Gly Val Asn TrpLys Val Glu His Gly Glu Leu 115 120 125 Val Cys Thr Ala Thr Asp Ser HisArg Leu Ala Met Arg Lys Val Lys 130 135 140 Ile Ile Glu Ser Glu Asn GluVal Ser Tyr Asn Val Val Ile Pro Gly 145 150 155 160 Lys Ser Leu Asn GluLeu Ser Lys Ile Ile Leu Asp Asp Gly Asn His 165 170 175 Pro Val Asp IleVal Met Thr Ala Asn Gln Val Leu Phe Lys Ala Glu 180 185 190 His Leu LeuPhe Phe Ser Arg Leu Leu Asp Gly Asn Tyr Pro Glu Thr 195 200 205 Ala ArgLeu Ile Pro Thr Glu Ser Lys Thr Thr Met Ile Val Asn Ala 210 215 220 LysGlu Phe Leu Gln Ala Ile Asp Arg Ala Ser Leu Leu Ala Arg Glu 225 230 235240 Gly Arg Asn Asn Val Val Lys Leu Thr Thr Leu Pro Gly Gly Met Leu 245250 255 Glu Ile Ser Ser Ile Ser Pro Glu Ile Gly Lys Val Thr Glu Gln Leu260 265 270 Gln Thr Glu Ser Leu Glu Gly Glu Glu Leu Asn Ile Ser Phe SerAla 275 280 285 Lys Tyr Met Met Asp Ala Leu Arg Ala Leu Asp Gly Thr AspIle Gln 290 295 300 Ile Ser Phe Thr Gly Ala Met Arg Pro Phe Leu Leu ArgPro Leu His 305 310 315 320 Thr Asp Ser Met Leu Gln Leu Ile Leu 
Pro ValArg Thr Tyr 325 330 175 492 DNA Bacillus stearothermophilus 175atgattaacc gcgtcatttt ggtcggcagg ttaacgagag atccggagtt gcgttacact 60ccaagcggag tggctgttgc cacgtttacg ctcgcggtca accgtccgtt tacaaatcag 120cagggcgagc gggaaacgga ttttattcaa tgtgtcgttt ggcgccgcca ggcggaaaac 180gtcgccaact ttttgaaaaa ggggagcttg gctggtgtcg atggccgact gcaaacccgc 240agctatgaaa atcaagaagg tcggcgtgtg tacgtgacgg aagtggtggc tgatagcgtc 300caatttcttg agccgaaagg aacgagcgag cagcgagggg cgacagcagg cggctactat 360ggggatccat tcccattcgg gcaagatcag aaccaccaat atccgaacga aaaagggttt 420ggccgcatcg atgacgatcc tttcgccaat gacggccagc cgatcgatat ttctgatgat 480gatttgccgt tt 492 176 164 PRT Bacillus stearothermophilus 176 Met IleAsn Arg Val Ile Leu Val Gly Arg Leu Thr Arg Asp Pro Glu 1 5 10 15 LeuArg Tyr Thr Pro Ser Gly Val Ala Val Ala Thr Phe Thr Leu Ala 20 25 30 ValAsn Arg Pro Phe Thr Asn Gln Ser Tyr Glu Asn Gln Glu Gly Arg 35 40 45 ArgVal Tyr Val Thr Glu Val Val Ala Asp Ser Val Gln Phe Leu Glu 50 55 60 ProLys Gly Thr Ser Glu Gln Arg Gly Ala Thr Ala Gly Gly Tyr Tyr 65 70 75 80Gln Gly Glu Arg Glu Thr Asp Phe Ile Gln Cys Val Val Trp Arg Arg 85 90 95Gln Ala Glu Asn Val Ala Asn Phe Leu Lys Lys Gly Ser Leu Ala Gly 100 105110 Val Asp Gly Arg Leu Gln Thr Arg Gly Asp Pro Phe Pro Phe Gly Gln 115120 125 Asp Gln Asn His Gln Tyr Pro Asn Glu Lys Gly Phe Gly Arg Ile Asp130 135 140 Asp Asp Pro Phe Ala Asn Asp Gly Gln Pro Ile Asp Ile Ser AspAsp 145 150 155 160 Asp Leu Pro Phe 177 1044 DNA Bacillusstearothermophilus 177 atgctggaac gcgtatgggg aaacattgaa aaacggcgtttttctcccct ttatttatta 60 tacggcaatg agccgttttt attaacggaa acgtatgagcgattggtgaa cgcagcgctt 120 ggccccgagg agcgggagtg gaacttggct gtgtacgactgcgaggaaac gccgatcgag 180 gcggcgcttg aggaggccga gacggtgccg tttttcggcgagcggcgtgt cattctcatc 240 aagcatccat atttttttac gtctgaaaaa gagaaggagatcgaacatga tttggcgaag 300 ctggaggcgt acttgaaggc gccgtcgccg ttttcgatcgtcgtcttttt cgcgccgtac 360 gagaagcttg atgagcgaaa aaaaattacg aagctcgccaaagagcaaag cgaagtcgtc 420 atcgccgccc cgctcgccga agcggagctg 
cgtgcctgggtgcggcgccg catcgagagc 480 caaggggcgc aagcaagcga cgaggcgatt gatgtcctgttgcggcgggc cgggacgcag 540 ctttccgcct tggcgaatga aatcgataaa ttggccctgtttgccggatc gggcggaacc 600 atcgaggcgg cggcggttga gcggcttgtc gcccgcacgccggaagaaaa cgtatttgtg 660 cttgtcgagc aagtggcgaa gcgcgacatt ccagcagcgttgcagacgtt ttatgatctg 720 cttgaaaaca atgaagagcc gatcaaaatt ttggcgttgctcgccgccca tttccgcttg 780 ctttcgcaag tgaaatggct tgcctcctta ggctacggacaggcgcaaat tgctgcggcg 840 ctcaaggtgc acccgttccg cgtcaagctc gctcttgctcaagcggcccg cttcgctgac 900 ggagagcttg ctgaggcgat caacgagctc gctgacgccgattacgaagt gaaaagcggg 960 gcggtcgatc gccggttggc cgttgagctg cttctgatgcgctggggcgc ccgcccggcg 1020 caagcggggc gccacggccg gcgg 1044 178 348 PRTBacillus stearothermophilus 178 Met Leu Glu Arg Val Trp Gly Asn Ile GluLys Arg Arg Phe Ser Pro 1 5 10 15 Leu Tyr Leu Leu Tyr Gly Asn Glu ProPhe Leu Leu Thr Glu Thr Tyr 20 25 30 Glu Arg Leu Val Asn Ala Ala Leu GlyPro Glu Glu Arg Glu Trp Asn 35 40 45 Leu Ala Val Tyr Asp Cys Glu Glu ThrPro Ile Glu Ala Ala Leu Glu 50 55 60 Glu Ala Glu Thr Val Pro Phe Phe GlyGlu Arg Arg Val Ile Leu Ile 65 70 75 80 Lys His Pro Tyr Phe Phe Thr SerGlu Lys Glu Lys Glu Ile Glu His 85 90 95 Asp Leu Ala Lys Leu Glu Ala TyrLeu Lys Ala Pro Ser Pro Phe Ser 100 105 110 Ile Val Val Phe Phe Ala ProTyr Glu Lys Leu Asp Glu Arg Lys Lys 115 120 125 Ile Thr Lys Leu Ala LysGlu Gln Ser Glu Val Val Ile Ala Ala Pro 130 135 140 Leu Ala Glu Ala GluLeu Arg Ala Trp Val Arg Arg Arg Ile Glu Ser 145 150 155 160 Gln Gly AlaGln Ala Ser Asp Glu Ala Ile Asp Val Leu Leu Arg Arg 165 170 175 Ala GlyThr Gln Leu Ser Ala Leu Ala Asn Glu Ile Asp Lys Leu Ala 180 185 190 LeuPhe Ala Gly Ser Gly Gly Thr Ile Glu Ala Ala Ala Val Glu Arg 195 200 205Leu Val Ala Arg Thr Pro Glu Glu Asn Val Phe Val Leu Val Glu Gln 210 215220 Val Ala Lys Arg Asp Ile Pro Ala Ala Leu Gln Thr Phe Tyr Asp Leu 225230 235 240 Leu Glu Asn Asn Glu Glu Pro Ile Lys Ile Leu Ala Leu Leu AlaAla 245 250 255 His Phe Arg Leu Leu Ser Gln Val Lys Trp Leu Ala Ser LeuGly Tyr 260 265 
270 Gly Gln Ala Gln Ile Ala Ala Ala Leu Lys Val His ProPhe Arg Val 275 280 285 Lys Leu Ala Leu Ala Gln Ala Ala Arg Phe Ala AspGly Glu Leu Ala 290 295 300 Glu Ala Ile Asn Glu Leu Ala Asp Ala Asp TyrGlu Val Lys Ser Gly 305 310 315 320 Ala Val Asp Arg Arg Leu Ala Val GluLeu Leu Leu Met Arg Trp Gly 325 330 335 Ala Arg Pro Ala Gln Ala Gly ArgHis Gly Arg Arg 340 345 179 757 DNA Bacillus stearothermophilus 179atgcgatggg aacagctagc gaaacgccag ccggtggtgg cgaaaatgct gcaaagcggc 60ttggaaaaag ggcggatttc tcatgcgtac ttgtttgagg ggcagcgggg gacgggcaaa 120aaagcggcca gtttgttgtt ggcgaaacgt ttgttttgtc tgtccccaat cggagtttcc 180ccgtgtctag agtgccgcaa ctgccggcgc atcgactccg gcaaccaccc tgacgtccgg 240gtgatcggcc cagatggagg atcaatcaaa aaggaacaaa tcgaatggct gcagcaagag 300ttctcgaaaa cagcggtcga gtcggataaa aaaatgtaca tcgttgagca cgccgatcaa 360atgacgacaa gcgctgccaa cagccttctg aaatttttgg aagagccgca tccggggacg 420gtggcggtat tgctgactga gcaataccac cgcctgctag ggacgatcgt ttcccgctgt 480caagtgcttt cgttccggcc gttgccgccg gcagagctcg cccagggact tgtcgaggag 540cacgtgccgt tgccgttggc gctgttggct gcccatttga caaacagctt cgaggaagca 600ctggcgcttg ccaaagatag ttggtttgcc gaggcgcgaa cattagtgct acaatggtat 660gagatgctgg gcaagccgga gctgcagctt ttgtttttca tccacgaccg cttgtttccg 720cattttttgg aaagccatca gcttgacctt ggacttg 757 180 252 PRT Bacillusstearothermophilus 180 Met Arg Trp Glu Gln Leu Ala Lys Arg Gln Pro ValVal Ala Lys Met 1 5 10 15 Leu Gln Ser Gly Leu Glu Lys Gly Arg Ile SerHis Ala Tyr Leu Phe 20 25 30 Glu Gly Gln Arg Gly Thr Gly Lys Lys Ala AlaSer Leu Leu Leu Ala 35 40 45 Lys Arg Leu Phe Cys Leu Ser Pro Ile Gly ValSer Pro Cys Leu Glu 50 55 60 Cys Arg Asn Cys Arg Arg Ile Asp Ser Gly AsnHis Pro Asp Val Arg 65 70 75 80 Val Ile Gly Pro Asp Gly Gly Ser Ile LysLys Glu Gln Ile Glu Trp 85 90 95 Leu Gln Gln Glu Phe Ser Lys Thr Ala ValGlu Ser Asp Lys Lys Met 100 105 110 Tyr Ile Val Glu His Ala Asp Gln MetThr Thr Ser Ala Ala Asn Ser 115 120 125 Leu Leu Lys Phe Leu Glu Glu ProHis Pro Gly Thr Val Ala Val Leu 130 135 140 Leu Thr Glu Gln 
Tyr His ArgLeu Leu Gly Thr Ile Val Ser Arg Cys 145 150 155 160 Gln Val Leu Ser PheArg Pro Leu Pro Pro Ala Glu Leu Ala Gln Gly 165 170 175 Leu Val Glu GluHis Val Pro Leu Pro Leu Ala Leu Leu Ala Ala His 180 185 190 Leu Thr AsnSer Phe Glu Glu Ala Leu Ala Leu Ala Lys Asp Ser Trp 195 200 205 Phe AlaGlu Ala Arg Thr Leu Val Leu Gln Trp Tyr Glu Met Leu Gly 210 215 220 LysPro Glu Leu Gln Leu Leu Phe Phe Ile His Asp Arg Leu Phe Pro 225 230 235240 His Phe Leu Glu Ser His Gln Leu Asp Leu Gly Leu 245 250 181 1677 DNABacillus stearothermophilus 181 gtggcatacc aagcgttata tcgcgtgtttcggccgcagc gctttgcgga catggtcggc 60 caagaacacg tgaccaagac gttgcaaagcgccctgcttc aacataaaat atcgcacgct 120 tacttatttt ccggcccgcg cggtacaggaaaaacgagcg cagcgaaaat tttcgccaag 180 gcggtcaact gtgaacaggc gccagcggcggagccatgca atgagtgtcc agcttgcctc 240 ggcattacga atggaacggt tcccgatgtgctggaaattg acgctgcttc caacaaccgc 300 gtcgatgaaa ttcgtgatat ccgtgagaaggtgaaatttg cgccaacgtc ggcccgctac 360 aaagtgtata tcatcgacga ggtgcatatgctgtcgatcg gtgcgtttaa cgcgctgttg 420 aaaacgttgg aggagccgcc gaaacacgtcattttcattt tggccacgac cgagccgcac 480 aaaattccgg cgacgatcat ttcccgctgccaacggttcg attttcgccg catcccgctt 540 caggcgatcg tttcacggct aaagtacgtcgcaagcgccc aaggtgtcga ggcgtcagat 600 gaggcattgt ccgccatcgc ccgtgctgcagacgggggga tgcgcgatgc gctcagcttg 660 cttgatcaag ccatttcgtt cagcgacgggaaacttcggc tcgacgacgt gctggcgatg 720 accggggctg catcatttgc cgccttatcgagcttcatcg aagccatcca ccgcaaagat 780 acagcggcgg ttcttcagca cttggaaacgatgatggcgc aagggaaaga tccgcatcgt 840 ttggttgaag acttgatttt gtactatcgcgatttattgc tgtacaaaac cgctccctat 900 gtggagggag cgattcaaat tgctgtcgttgacgaagcgt tcacttcact gtcggaaatg 960 attccggttt ccaatttata cgaggccatcgagttgctga acaaaagcca gcaagagatg 1020 aagtggacaa accacccgcg ccttctgttggaagtggcgc ttgtgaaact ttgccatcca 1080 tcagccgccg ccccgtcgct gtcggcttccgagttggaac cgttgataaa gcggattgaa 1140 acgctggagg cggaattgcg gcgcctgaaggaacaaccgc ctgcccctcc gtcgaccgcc 1200 gcgccggtga aaaaactgtc caaaccgatgaaaacggggg gatataaagc cccggttggc 1260 
cgcatttacg agctgttgaa acaggcgacgcatgaagatt tagctttggt gaaaggatgc 1320 tgggcggatg tgctcgacac gttgaaacggcagcataaag tgtcgcacgc tgccttgctg 1380 caagagagcg agccggttgc agcgagcgcctcagcgtttg tattaaaatt caaatacgaa 1440 atccactgca aaatggcgac cgatcccacaagttcggtca aagaaaacgt cgaagcgatt 1500 ttgtttgagc tgacaaaccg ccgctttgaaatggtagcca ttccggaggg agaatgggga 1560 aaaataagag aagagttcat ccgcaataaggacgccatgg tggaaaaaag cgaagaagat 1620 ccgttaatcg ccgaagcgaa gcggctgtttggcgaagagc tgatcgaaat taaagaa 1677 182 559 PRT Bacillusstearothermophilus 182 Val Ala Tyr Gln Ala Leu Tyr Arg Val Phe Arg ProGln Arg Phe Ala 1 5 10 15 Asp Met Val Gly Gln Glu His Val Thr Lys ThrLeu Gln Ser Ala Leu 20 25 30 Leu Gln His Lys Ile Ser His Ala Tyr Leu PheSer Gly Pro Arg Gly 35 40 45 Thr Gly Lys Thr Ser Ala Ala Lys Ile Phe AlaLys Ala Val Asn Cys 50 55 60 Glu Gln Ala Pro Ala Ala Glu Pro Cys Asn GluCys Pro Ala Cys Leu 65 70 75 80 Gly Ile Thr Asn Gly Thr Val Pro Asp ValLeu Glu Ile Asp Ala Ala 85 90 95 Ser Asn Asn Arg Val Asp Glu Ile Arg AspIle Arg Glu Lys Val Lys 100 105 110 Phe Ala Pro Thr Ser Ala Arg Tyr LysVal Tyr Ile Ile Asp Glu Val 115 120 125 His Met Leu Ser Ile Gly Ala PheAsn Ala Leu Leu Lys Thr Leu Glu 130 135 140 Glu Pro Pro Lys His Val IlePhe Ile Leu Ala Thr Thr Glu Pro His 145 150 155 160 Lys Ile Pro Ala ThrIle Ile Ser Arg Cys Gln Arg Phe Asp Phe Arg 165 170 175 Arg Ile Pro LeuGln Ala Ile Val Ser Arg Leu Lys Tyr Val Ala Ser 180 185 190 Ala Gln GlyVal Glu Ala Ser Asp Glu Ala Leu Ser Ala Ile Ala Arg 195 200 205 Ala AlaAsp Gly Gly Met Arg Asp Ala Leu Ser Leu Leu Asp Gln Ala 210 215 220 IleSer Phe Ser Asp Gly Lys Leu Arg Leu Asp Asp Val Leu Ala Met 225 230 235240 Thr Gly Ala Ala Ser Phe Ala Ala Leu Ser Ser Phe Ile Glu Ala Ile 245250 255 His Arg Lys Asp Thr Ala Ala Val Leu Gln His Leu Glu Thr Met Met260 265 270 Ala Gln Gly Lys Asp Pro His Arg Leu Val Glu Asp Leu Ile LeuTyr 275 280 285 Tyr Arg Asp Leu Leu Leu Tyr Lys Thr Ala Pro Tyr Val GluGly Ala 290 295 300 Ile Gln Ile Ala Val Val Asp Glu Ala Phe Thr Ser 
LeuSer Glu Met 305 310 315 320 Ile Pro Val Ser Asn Leu Tyr Glu Ala Ile GluLeu Leu Asn Lys Ser 325 330 335 Gln Gln Glu Met Lys Trp Thr Asn His ProArg Leu Leu Leu Glu Val 340 345 350 Ala Leu Val Lys Leu Cys His Pro SerAla Ala Ala Pro Ser Leu Ser 355 360 365 Ala Ser Glu Leu Glu Pro Leu IleLys Arg Ile Glu Thr Leu Glu Ala 370 375 380 Glu Leu Arg Arg Leu Lys GluGln Pro Pro Ala Pro Pro Ser Thr Ala 385 390 395 400 Ala Pro Val Lys LysLeu Ser Lys Pro Met Lys Thr Gly Gly Tyr Lys 405 410 415 Ala Pro Val GlyArg Ile Tyr Glu Leu Leu Lys Gln Ala Thr His Glu 420 425 430 Asp Leu AlaLeu Val Lys Gly Cys Trp Ala Asp Val Leu Asp Thr Leu 435 440 445 Lys ArgGln His Lys Val Ser His Ala Ala Leu Leu Gln Glu Ser Glu 450 455 460 ProVal Ala Ala Ser Ala Ser Ala Phe Val Leu Lys Phe Lys Tyr Glu 465 470 475480 Ile His Cys Lys Met Ala Thr Asp Pro Thr Ser Ser Val Lys Glu Asn 485490 495 Val Glu Ala Ile Leu Phe Glu Leu Thr Asn Arg Arg Phe Glu Met Val500 505 510 Ala Ile Pro Glu Gly Glu Trp Gly Lys Ile Arg Glu Glu Phe IleArg 515 520 525 Asn Lys Asp Ala Met Val Glu Lys Ser Glu Glu Asp Pro LeuIle Ala 530 535 540 Glu Ala Lys Arg Leu Phe Gly Glu Glu Leu Ile Glu IleLys Glu 545 550 555 183 4301 DNA Bacillus stearothermophilus 183atggtgacaa aagagcaaaa agagcggttt ctcatcctgc ttgagcagct gaagatgacg 60tcggacgaat ggatgccgca ttttcgtgag gcagccattc gcaaagtcgt gatcgataaa 120gaggagaaaa gctggcattt ttattttcag ttcgacaacg tgctgccggt tcatgtatac 180aaaacgtttg ccgatcggct gcagacggcg ttccgccata tcgccgccgt ccgccatacg 240atggaggtcg aagcgccgcg cgtaactgag gcggatgtgc aggcgtattg gccgctttgc 300cttgccgagc tgcaagaagg catgtcgccg cttgtcgatt ggctcagccg gcagacgcct 360gagctgaaag gaaacaagct gcttgtcgtt gcccgccatg aagcggaagc gctggcgatc 420aaacggcggt tcgccaaaaa aatcgctgat gtgtacgctt cgtttgggtt tccccccctt 480cagcttgacg tcagcgtcga gccgtccaag caagaaatgg aacagttttt ggcgcaaaaa 540cagcaagagg acgaagagcg agcgcttgct gtactgaccg atttagcgag ggaagaagaa 600aaggccgcgt ctgcgccgcc gtccggtccg cttgtcatcg gctatccgat ccgcgacgag 660gagccggtgc ggcggcttga aacgatcgtc 
gaagaagagc ggcgcgtcgt tgtgcaaggc 720tatgtatttg acgccgaagt gagcgaatta aaaagcggcc gcacgctgtt gaccatgaaa 780atcacagatt acacgaactc gattttagtc aaaatgttct cgcgcgacaa agaggacgcc 840gagcttatga gcggcgtcaa aaaaggcatg tgggtgaaag tgcgcggcag cgtgcaaaac 900gatacgttcg tccgtgattt ggtcatcatc gccaacgatt tgaacgaaat cgccgcaaac 960gaacggcaag atacggcgcc ggaaggggaa aagagggtcg agctccattt gcataccccg 1020atgagccaaa tggacgcggt cacctcggtg acaaaactca ttgagcaagc gaaaaaatgg 1080gggcatccgg cgatcgccgt caccgaccat gccgttgttc agtcgtttcc ggaggcctac 1140agcgcggcga aaaaacacgg catgaaggtc atttacggcc ttgaggcgaa catcgtcgac 1200gatggcgtgc cgatcgccta caatgagacg caccgccgtc tttcggagga aacgtacgtc 1260gtctttgacg tcgagacgac gggcctgtcg gctgtgtaca atacgatcat tgagctggcg 1320gcggtgaaag tgaaagacgg cgagatcatc gaccgattca tgtcgtttgc caaccctgga 1380catccgttgt cggtgacaac gatggagctg actgggatca ccgatgagat ggtgaaagac 1440gccccgaagc cggacgaggt gctagcccgt tttgttgact gggccggcga tgcgacgctt 1500gttgcccaca acgccagctt tgacatcggt tttttaaacg cgggcctcgc tcgcatgggg 1560cgcggcaaaa tcgcgaatcc agtcatcgat acgctcgagc tggcccgttt tttatacccg 1620gatttgaaaa accatcggct caatacattg tgcaaaaaat ttgacattga attgacgcag 1680catcaccgcg ccatctacga cgcggaggcg accgggcatt tgcttatgcg gctgttgaag 1740gaagcggaag agcgcggcat actgtttcat gacgaattaa acagccgcac gcacagcgaa 1800gcgtcctatc ggcttgcgcg cccgttccat gtgacgctgt tggcgcaaaa cgagactgga 1860ttgaaaaatt tgttcaagct tgtgtcattg tcgcacattc aatattttca ccgtgtgccg 1920cgcatcccgc gctccgtgct cgtcaagcac cgcgacggcc tgcttgtcgg ctcgggctgc 1980gacaaaggag agctgtttga caacttgatc caaaaggcgc cggaagaagt cgaagacatc 2040gcccgttttt acgattttct tgaagtgcat ccgccggacg tgtacaagcc gctcatcgag 2100atggattatg tgaaagacga agagatgatc aaaaacatca tccgcagcat cgtcgccctt 2160ggtgagaagc ttgacatccc ggttgtcgcc actggcaacg tccattactt gaacccagaa 2220gataaaattt accggaaaat cttaatccat tcgcaaggcg gggcgaatcc gctcaaccgc 2280catgaactgc cggatgtata tttccgtacg acgaatgaaa tgcttgactg cttctcgttt 2340ttagggccgg aaaaagcgaa ggaaatcgtc gttgacaaca cgcaaaaaat cgcttcgtta 
2400atcggcgatg tcaagccgat caaagatgag ctgtatacgc cgcgcattga aggggcggac 2460gaggaaatca gggaaatgag ctaccggcgg gcgaaggaaa tttacggcga cccgttgccg 2520aaacttgttg aagagcggct tgagaaggag ctaaaaagca tcatcggcca tggctttgcc 2580gtcatttatt tgatctcgca caagcttgtg aaaaaatcgc tcgatgacgg ctaccttgtc 2640gggtcgcgcg gatcggtcgg ctcgtcgttt gtcgcgacga tgacggaaat caccgaggtc 2700aatccgctgc cgccgcatta cgtttgcccg aactgcaagc attcggagtt ctttaacgac 2760ggttcagtcg gctcagggtt tgatttgccg gataaaaact gcccgcgatg tgggacgaaa 2820tacaagaaag acgggcacga catcccgttt gagacgtttc tcggctttaa aggcgacaaa 2880gtgccggata tcgacttgaa cttttccggc gaataccagc cgcgcgccca caactatacg 2940aaagtgctgt ttggcgaaga caacgtctac cgcgccggga cgattggcac ggtcgctgac 3000aaaacggcgt acggatttgt caaagcgtat gcgagcgacc ataacttaga gctgcgcggc 3060gcggaaatcg acggctcgcg gctggctgca ccggggtgaa gcggacgacc gggcagcatc 3120cgggcggcat catcgtcgtc ccggattata tggaaattta cgattttacg ccgattcaat 3180atccggccga tgacacgtcc tctgaatggc ggacgaccca tttcgacttc cattcgatcc 3240acgacaattt gttgaagctc gatattctcg ggcacgacga tccgacggtc attcgcatgc 3300tgcaagattt aagcggcatc gatccgaaaa cgatcccgac cgacgacccg gatgtgatgg 3360gcattttcag cagcaccgag ccgcttggcg ttacgccgga gcaaatcatg tgcaatgtcg 3420gcacgatcgg cattccggag tttggcacgc gcttcgttcg gcaaatgttg gaagagacaa 3480ggccaaaaac gttttccgaa ctcgtgcaaa tttccggctt gtcgcacggc accgatgtgt 3540ggctcggcaa cgcgcaagag ctcattcaaa acggcacgtg tacgttatcg gaagtcatcg 3600gctgccgcga cgacattatg gtctatttga tttaccgcgg gctcgagccg tcgctcgctt 3660ttaaaatcat ggaatccgtg cgcaaaggaa aaggcttaac gccggagttt gaagcagaaa 3720tgcgcaaaca tgacgtgccg gagtggtaca tcgattcatg caaaaaaatc aagtacatgt 3780tcccgaaagc gcacgccgcc gcctacgtgt taatggcggt gcgcatcgcc tactttaagg 3840tgcaccatcc gcttttgtat tacgcgtcgt actttacggt gcgggcggag gactttgacc 3900ttgacgccat gatcaaagga tcacccgcca ttcgcaagcg gattgaggaa atcaacgcca 3960aaggcattca ggcgacggcg aaagaaaaaa gcttgctcac ggttcttgag gtggccttag 4020agatgtgcga gcgcggcttt tcctttaaaa atatcgattt gtaccgctcg caggcgacgg 4080aattcgtcat tgacggcaat tctctcattc 
cgccgttcaa cgccattccg gggcttggga 4140cgaacgtggc gcaggcgatc gtgcgcgccc gcgaggaagg cgagtttttg tcgaaggagg 4200atttgcaaca gcgcggcaaa ttgtcgaaaa cgctgctcga gtatctagaa agccgcggct 4260gccttgactc gcttccagac cataaccagc tgtcgctgtt t 4301 184 1433 PRT Bacillusstearothermophilus 184 Met Val Thr Lys Glu Gln Lys Glu Arg Phe Leu IleLeu Leu Glu Gln 1 5 10 15 Leu Lys Met Thr Ser Asp Glu Trp Met Pro HisPhe Arg Glu Ala Ala 20 25 30 Ile Arg Lys Val Val Ile Asp Lys Glu Glu LysSer Trp His Phe Tyr 35 40 45 Phe Gln Phe Asp Asn Val Leu Pro Val His ValTyr Lys Thr Phe Ala 50 55 60 Asp Arg Leu Gln Thr Ala Phe Arg His Ile AlaAla Val Arg His Thr 65 70 75 80 Met Glu Val Glu Ala Pro Arg Val Thr GluAla Asp Val Gln Ala Tyr 85 90 95 Trp Pro Leu Cys Leu Ala Glu Leu Gln GluGly Met Ser Pro Leu Val 100 105 110 Asp Trp Leu Ser Arg Gln Thr Pro GluLeu Lys Gly Asn Lys Leu Leu 115 120 125 Val Val Ala Arg His Glu Ala GluAla Leu Ala Ile Lys Arg Arg Phe 130 135 140 Ala Lys Lys Ile Ala Asp ValTyr Ala Ser Phe Gly Phe Pro Pro Leu 145 150 155 160 Gln Leu Asp Val SerVal Glu Pro Ser Lys Gln Glu Met Glu Gln Phe 165 170 175 Leu Ala Gln LysGln Gln Glu Asp Glu Glu Arg Ala Leu Ala Val Leu 180 185 190 Thr Asp LeuAla Arg Glu Glu Glu Lys Ala Ala Ser Ala Pro Pro Ser 195 200 205 Gly ProLeu Val Ile Gly Tyr Pro Ile Arg Asp Glu Glu Pro Val Arg 210 215 220 ArgLeu Glu Thr Ile Val Glu Glu Glu Arg Arg Val Val Val Gln Gly 225 230 235240 Tyr Val Phe Asp Ala Glu Val Ser Glu Leu Lys Ser Gly Arg Thr Leu 245250 255 Leu Thr Met Lys Ile Thr Asp Tyr Thr Asn Ser Ile Leu Val Lys Met260 265 270 Phe Ser Arg Asp Lys Glu Asp Ala Glu Leu Met Ser Gly Val LysLys 275 280 285 Gly Met Trp Val Lys Val Arg Gly Ser Val Gln Asn Asp ThrPhe Val 290 295 300 Arg Asp Leu Val Ile Ile Ala Asn Asp Leu Asn Glu IleAla Ala Asn 305 310 315 320 Glu Arg Gln Asp Thr Ala Pro Glu Gly Glu LysArg Val Glu Leu His 325 330 335 Leu His Thr Pro Met Ser Gln Met Asp AlaVal Thr Ser Val Thr Lys 340 345 350 Leu Ile Glu Gln Ala Lys Lys Trp GlyHis Pro Ala Ile Ala Val Thr 355 360 365 Asp 
His Ala Val Val Gln Ser PhePro Glu Ala Tyr Ser Ala Ala Lys 370 375 380 Lys His Gly Met Lys Val IleTyr Gly Leu Glu Ala Asn Ile Val Asp 385 390 395 400 Asp Gly Val Pro IleAla Tyr Asn Glu Thr His Arg Arg Leu Ser Glu 405 410 415 Glu Thr Tyr ValVal Phe Asp Val Glu Thr Thr Gly Leu Ser Ala Val 420 425 430 Tyr Asn ThrIle Ile Glu Leu Ala Ala Val Lys Val Lys Asp Gly Glu 435 440 445 Ile IleAsp Arg Phe Met Ser Phe Ala Asn Pro Gly His Pro Leu Ser 450 455 460 ValThr Thr Met Glu Leu Thr Gly Ile Thr Asp Glu Met Val Lys Asp 465 470 475480 Ala Pro Lys Pro Asp Glu Val Leu Ala Arg Phe Val Asp Trp Ala Gly 485490 495 Asp Ala Thr Leu Val Ala His Asn Ala Ser Phe Asp Ile Gly Phe Leu500 505 510 Asn Ala Gly Leu Ala Arg Met Gly Arg Gly Lys Ile Ala Asn ProVal 515 520 525 Ile Asp Thr Leu Glu Leu Ala Arg Phe Leu Tyr Pro Asp LeuLys Asn 530 535 540 His Arg Leu Asn Thr Leu Cys Lys Lys Phe Asp Ile GluLeu Thr Gln 545 550 555 560 His His Arg Ala Ile Tyr Asp Ala Glu Ala ThrGly His Leu Leu Met 565 570 575 Arg Leu Leu Lys Glu Ala Glu Glu Arg GlyIle Leu Phe His Asp Glu 580 585 590 Leu Asn Ser Arg Thr His Ser Glu AlaSer Tyr Arg Leu Ala Arg Pro 595 600 605 Phe His Val Thr Leu Leu Ala GlnAsn Glu Thr Gly Leu Lys Asn Leu 610 615 620 Phe Lys Leu Val Ser Leu SerHis Ile Gln Tyr Phe His Arg Val Pro 625 630 635 640 Arg Ile Pro Arg SerVal Leu Val Lys His Arg Asp Gly Leu Leu Val 645 650 655 Gly Ser Gly CysAsp Lys Gly Glu Leu Phe Asp Asn Leu Ile Gln Lys 660 665 670 Ala Pro GluGlu Val Glu Asp Ile Ala Arg Phe Tyr Asp Phe Leu Glu 675 680 685 Val HisPro Pro Asp Val Tyr Lys Pro Leu Ile Glu Met Asp Tyr Val 690 695 700 LysAsp Glu Glu Met Ile Lys Asn Ile Ile Arg Ser Ile Val Ala Leu 705 710 715720 Gly Glu Lys Leu Asp Ile Pro Val Val Ala Thr Gly Asn Val His Tyr 725730 735 Leu Asn Pro Glu Asp Lys Ile Tyr Arg Lys Ile Leu Ile His Ser Gln740 745 750 Gly Gly Ala Asn Pro Leu Asn Arg His Glu Leu Pro Asp Val TyrPhe 755 760 765 Arg Thr Thr Asn Glu Met Leu Asp Cys Phe Ser Phe Leu GlyPro Glu 770 775 780 Lys Ala Lys Glu Ile Val Val Asp Asn 
Thr Gln Lys IleAla Ser Leu 785 790 795 800 Ile Gly Asp Val Lys Pro Ile Lys Asp Glu LeuTyr Thr Pro Arg Ile 805 810 815 Glu Gly Ala Asp Glu Glu Ile Arg Glu MetSer Tyr Arg Arg Ala Lys 820 825 830 Glu Ile Tyr Gly Asp Pro Leu Pro LysLeu Val Glu Glu Arg Leu Glu 835 840 845 Lys Glu Leu Lys Ser Ile Ile GlyHis Gly Phe Ala Val Ile Tyr Leu 850 855 860 Ile Ser His Lys Leu Val LysLys Ser Leu Asp Asp Gly Tyr Leu Val 865 870 875 880 Gly Ser Arg Gly SerVal Gly Ser Ser Phe Val Ala Thr Met Thr Glu 885 890 895 Ile Thr Glu ValAsn Pro Leu Pro Pro His Tyr Val Cys Pro Asn Cys 900 905 910 Lys His SerGlu Phe Phe Asn Asp Gly Ser Val Gly Ser Gly Phe Asp 915 920 925 Leu ProAsp Lys Asn Cys Pro Arg Cys Gly Thr Lys Tyr Lys Lys Asp 930 935 940 GlyHis Asp Ile Pro Phe Glu Thr Phe Leu Gly Phe Lys Gly Asp Lys 945 950 955960 Val Pro Asp Ile Asp Leu Asn Phe Ser Gly Glu Tyr Gln Pro Arg Ala 965970 975 His Asn Tyr Thr Lys Val Leu Phe Gly Glu Asp Asn Val Tyr Arg Ala980 985 990 Gly Thr Ile Gly Thr Val Ala Asp Lys Thr Ala Tyr Gly Phe ValLys 995 1000 1005 Ala Tyr Ala Ser Asp His Asn Leu Glu Leu Arg Gly AlaGlu Ile Asp 1010 1015 1020 Leu Ala Ala Gly Cys Thr Gly Val Lys Arg ThrThr Gly Gln His Pro 1025 1030 1035 1040 Gly Gly Ile Ile Val Val Pro AspTyr Met Glu Ile Tyr Asp Phe Thr 1045 1050 1055 Pro Ile Gln Tyr Pro AlaAsp Asp Thr Ser Ser Glu Trp Arg Thr Thr 1060 1065 1070 His Phe Asp PheHis Ser Ile His Asp Asn Leu Leu Lys Leu Asp Ile 1075 1080 1085 Leu GlyHis Asp Asp Pro Thr Val Ile Arg Met Leu Gln Asp Leu Ser 1090 1095 1100Gly Ile Asp Pro Lys Thr Ile Pro Thr Asp Asp Pro Asp Val Met Gly 11051110 1115 1120 Ile Phe Ser Ser Thr Glu Pro Leu Gly Val Thr Pro Glu GlnIle Met 1125 1130 1135 Cys Asn Val Gly Thr Ile Gly Ile Pro Glu Phe GlyThr Arg Phe Val 1140 1145 1150 Arg Gln Met Leu Glu Glu Thr Arg Pro LysThr Phe Ser Glu Leu Val 1155 1160 1165 Gln Ile Ser Gly Leu Ser His GlyThr Asp Val Trp Leu Gly Asn Ala 1170 1175 1180 Gln Glu Leu Ile Gln AsnGly Thr Cys Thr Leu Ser Glu Val Ile Gly 1185 1190 1195 1200 Cys Arg AspAsp Ile Met 
Val Tyr Leu Ile Tyr Arg Gly Leu Glu Pro 1205 1210 1215 SerLeu Ala Phe Lys Ile Met Glu Ser Val Arg Lys Gly Lys Gly Leu 1220 12251230 Thr Pro Glu Phe Glu Ala Glu Met Arg Lys His Asp Val Pro Glu Trp1235 1240 1245 Tyr Ile Asp Ser Cys Lys Lys Ile Lys Tyr Met Phe Pro LysAla His 1250 1255 1260 Ala Ala Ala Tyr Val Leu Met Ala Val Arg Ile AlaTyr Phe Lys Val 1265 1270 1275 1280 His His Pro Leu Leu Tyr Tyr Ala SerTyr Phe Thr Val Arg Ala Glu 1285 1290 1295 Asp Phe Asp Leu Asp Ala MetIle Lys Gly Ser Pro Ala Ile Arg Lys 1300 1305 1310 Arg Ile Glu Glu IleAsn Ala Lys Gly Ile Gln Ala Thr Ala Lys Glu 1315 1320 1325 Lys Ser LeuLeu Thr Val Leu Glu Val Ala Leu Glu Met Cys Glu Arg 1330 1335 1340 GlyPhe Ser Phe Lys Asn Ile Asp Leu Tyr Arg Ser Gln Ala Thr Glu 1345 13501355 1360 Phe Val Ile Asp Gly Asn Ser Leu Ile Pro Pro Phe Asn Ala IlePro 1365 1370 1375 Gly Leu Gly Thr Asn Val Ala Gln Ala Ile Val Arg AlaArg Glu Glu 1380 1385 1390 Gly Glu Phe Leu Ser Lys Glu Asp Leu Gln GlnArg Gly Lys Leu Ser 1395 1400 1405 Lys Thr Leu Leu Glu Tyr Leu Glu SerArg Gly Cys Leu Asp Ser Leu 1410 1415 1420 Pro Asp His Asn Gln Leu SerLeu Phe 1425 1430 185 199 PRT Thermus thermophilus 185 Thr Pro Lys GlyLys Asp Leu Val Arg His Leu Glu Asn Arg Ala Lys 1 5 10 15 Arg Leu GlyLeu Arg Leu Pro Gly Gly Val Ala Gln Tyr Leu Ala Ser 20 25 30 Leu Glu GlyAsp Leu Glu Ala Leu Glu Arg Glu Leu Glu Lys Leu Ala 35 40 45 Leu Leu SerPro Pro Leu Thr Leu Glu Lys Val Glu Lys Val Val Ala 50 55 60 Leu Arg ProPro Leu Thr Gly Phe Asp Leu Val Arg Ser Val Leu Glu 65 70 75 80 Lys AspPro Lys Glu Ala Leu Leu Arg Leu Gly Arg Leu Lys Glu Glu 85 90 95 Gly GluGlu Pro Leu Arg Leu Leu Gly Ala Leu Ser Trp Gln Phe Ala 100 105 110 LeuLeu Ala Arg Ala Phe Phe Leu Leu Arg Glu Met Pro Arg Pro Lys 115 120 125Glu Glu Asp Leu Ala Arg Leu Glu Ala His Pro Tyr Ala Ala Lys Lys 130 135140 Ala Leu Leu Glu Ala Ala Arg Arg Leu Thr Glu Glu Ala Leu Lys Glu 145150 155 160 Ala Leu Asp Ala Leu Met Glu Ala Glu Lys Arg Ala Lys Gly GlyLys 165 170 175 Asp Pro Trp Leu Ala Leu 
Glu Ala Ala Val Leu Arg Leu AlaArg Pro 180 185 190 Ala Gly Gln Pro Arg Val Asp 195 186 27 DNAArtificial Sequence Description of Artificial Sequence PCR primer 186gcccagtacc tcgcctccct cgagggg 27 187 27 DNA Artificial SequenceDescription of Artificial Sequence PCR primer 187 ggcccccttg gccttctcggcctccat 27 188 331 DNA Thermus thermophilus 188 agactcgagg ccctggagcgggagctggag aagcttgccc tcctctcccc acccctcacc 60 ctggagaagg tggagaaggtggtggccctg aggccccccc tcacgggctt tgacctggtg 120 cgctccgtcc tggagaaggaccccaaggag gccctcctgc gcctcaggcg cctcagggag 180 gagggggagg agcccctcaggctcctcggg gccctctcct ggcagttcgc cctcctcgcc 240 cgggccttct tcctcctccgggaaaacccc aggcccaagg aggaggacct cgcccgcctc 300 gaggcccacc cctacgccgccaagaaggcc a 331 189 110 PRT Thermus thermophilus 189 Arg Leu Glu AlaLeu Glu Arg Glu Leu Glu Lys Leu Ala Leu Leu Ser 1 5 10 15 Pro Pro LeuThr Leu Glu Lys Val Glu Lys Val Val Ala Leu Arg Pro 20 25 30 Pro Leu ThrGly Phe Asp Leu Val Arg Ser Val Leu Glu Lys Asp Pro 35 40 45 Lys Glu AlaLeu Leu Arg Leu Arg Arg Leu Arg Glu Glu Gly Glu Glu 50 55 60 Pro Leu ArgLeu Leu Gly Ala Leu Ser Trp Gln Phe Ala Leu Leu Ala 65 70 75 80 Arg AlaPhe Phe Leu Leu Arg Glu Asn Pro Arg Pro Lys Glu Glu Asp 85 90 95 Leu AlaArg Leu Glu Ala His Pro Tyr Ala Ala Lys Lys Ala 100 105 110 190 31 DNAArtificial Sequence Description of Artificial Sequence PCR primer 190gtggtgtcta gacatcataa cggttctggc a 31 191 27 DNA Artificial SequenceDescription of Artificial Sequence PCR Primer 191 gagggccacc accttctccaccttctc 27 192 25 DNA Artificial Sequence Description of ArtificialSequence PCR Primer 192 ctccgtcctg gagaaggacc ccaag 25 193 29 DNAArtificial Sequence Description of Artificial Sequence PCR primer 193cgcgaattca acgcsctcct caagacsct 29 194 31 DNA Artificial SequenceDescription of Artificial Sequence PCR primer 194 gacacttaac atatggtcatcgccttcacc g 31 195 38 DNA Artificial Sequence Description of ArtificialSequence PCR primer 195 gtgtgtgaat tcgggtcaac gggcgaggcg gaggaccg 38 19610 PRT Deinococcus radiodurans 196 
Val Ile Leu Asn Pro Gly Ser Val GlyGln 1 5 10 197 10 PRT Methanococcus jannaschii 197 Tyr Leu Ile Asn ProGly Ser Val Gly Gln 1 5 10 198 10 PRT Thermotoga maritima 198 Leu ValLeu Asn Pro Gly Ser Ala Gly Arg 1 5 10 199 28 DNA Artificial SequenceDescription of Artificial Sequence PCR primer 199 ctggtgaacc cgggctccgtgggccagc 28 200 10 PRT Artificial Sequence Description of ArtificialSequence polypeptide 200 Leu Leu Val Asn Pro Gly Ser Val Gly Gln 1 5 10201 27 DNA Artificial Sequence Description of Artificial Sequence PCRprimer 201 ctcgaggagc ttgaggaggg tgttggc 27 202 9 PRT ArtificialSequence Description of Artificial Sequence polypeptide 202 Ala Asn ThrLeu Leu Lys Leu Leu Glu 1 5 203 32 PRT Deinococcus radiodurans 203 GlyPhe Gly Gly Val Gln Leu His Ala Ala His Gly Tyr Leu Leu Ser 1 5 10 15Gln Phe Leu Ser Pro Arg His Asn Val Arg Glu Asp Glu Tyr Gly Gly 20 25 30204 32 PRT Caenorhabditis elegans 204 Gly Phe Asp Gly Ile Gln Leu HisGly Ala His Gly Tyr Leu Leu Ser 1 5 10 15 Gln Phe Thr Ser Pro Thr ThrAsn Lys Arg Val Asp Lys Tyr Gly Gly 20 25 30 205 32 PRT Pseudomonasaeruginosa 205 Gly Phe Ser Gly Val Glu Ile His Ala Ala His Gly Tyr LeuLeu Ser 1 5 10 15 Gln Phe Leu Ser Pro Leu Ser Asn Arg Arg Ser Asp AlaTrp Gly Gly 20 25 30 206 32 PRT Archaeoglobus fulgidus 206 Gly Phe AspAla Val Gln Leu His Ala Ala His Gly Tyr Leu Leu Ser 1 5 10 15 Glu PheIle Ser Pro His Val Asn Arg Arg Lys Asp Glu Tyr Gly Gly 20 25 30 207 30DNA Artificial Sequence Description of Artificial Sequence PCR primer207 catcctggac tcggcccacc tcctcaccga 30 208 9 PRT Artificial SequenceDescription of Artificial Sequence polypeptide 208 Ile Leu Asp Ser AlaHis Leu Leu Thr 1 5 209 33 DNA Artificial Sequence Description ofArtificial Sequence PCR primer 209 gaggaggtag ccgtgggccg cgtggagctc cac33 210 11 PRT Artificial Sequence Description of Artificial Sequencepolypeptide 210 Val Glu Leu His Ala Ala His Gly Tyr Leu Leu 1 5 10 21132 DNA Artificial Sequence Description of Artificial Sequence PCR primer211 ggctttccca 
tatggctcta cacccggctc ac 32 212 29 DNA Artificial Sequence Description of Artificial Sequence PCR primer 212 gcgtggatcc acggtcatgt ctctaagtc 29

What is claimed:
 1. An isolated Thermotoga delta prime subunit of a DNA polymerase III-type enzyme, the isolated delta prime subunit: (i) comprising the amino acid sequence of SEQ ID NO: 148; or (ii) being encoded by a nucleic acid molecule hybridizing to the complement of SEQ ID NO: 147 under hybridization conditions comprising at most about 0.9M sodium citrate buffer at a temperature of at least about 37° C.
 2. The isolated Thermotoga delta prime subunit according to claim 1 wherein the Thermotoga species is Thermotoga maritima.
 3. The isolated Thermotoga delta prime subunit according to claim 2 wherein the delta prime subunit comprises the amino acid sequence of SEQ ID NO: 148.
 4. The isolated Thermotoga delta prime subunit according to claim 1 wherein the delta prime subunit is encoded by a nucleic acid molecule that hybridizes to the complement of SEQ ID NO: 147 under hybridization conditions comprising at most about 0.9M sodium citrate buffer at a temperature of at least about 51° C.
 5. The isolated Thermotoga delta prime subunit according to claim 1 wherein the delta prime subunit is purified.
 6. A clamp loader complex comprising the Thermotoga delta prime subunit according to claim 1.
 7. A DNA polymerase III-type enzyme complex comprising the clamp loader complex according to claim 6.
 8. A kit comprising: a container that contains therein either a deoxynucleoside triphosphate or a dideoxynucleoside triphosphate; and a container that contains therein the DNA polymerase III-type enzyme complex according to claim 7.